RebelMouse Technical Infrastructure, Security Overview, Disaster Recovery Plans and Escalation Processes
Technical Management - The Team
Paul Berry: CEO & Founder. Previously CTO of The Huffington Post where he handled scaling the site from 3 million monthly UVs to 145 million monthly UVs, with peaks of up to 200 thousand simultaneous users and 4 billion monthly page views.
Nike Gurin-Petrovych: CTO. Previously the infrastructure and code lead at The Huffington Post. Together with Paul, Nike helped scale the site smoothly to handle massive traffic with “five 9's” of uptime for user-facing and editorial tools.
Roberto Alamos Moreno leads security at RebelMouse, and worked in the same capacity at The Huffington Post. Before joining The Huffington Post, Roberto ran a security consulting company of his own.
RebelMouse's technical team also includes two full-time system administrators and twenty-four additional full-time developers.
Host - Amazon Web Services
RebelMouse was built from the ground up on the AWS platform. We have built each component and layer to scale infinitely, with a goal of serving billions of unique views and page views seamlessly. We ask major partners with extremely high traffic to give us estimates for peak loads, but that isn't a requirement as we react and adapt dynamically to scale requests.
AWS Region Management
To keep latency as low as possible, we operate production in the Northern Virginia region. Because Amazon has historically had failures confined to one particular region, we keep slaves of all the data in the Oregon region, on the opposite coast, so that all data is constantly replicated and within seconds or minutes we can have the entire site running from frontends there. Traffic is not constantly distributed across both regions because we have seen that hurt latency, but the site is ready to run across two geographically separate regions. We are happy to run live tests showing the move from one region to the other if that is helpful.
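The decision to move traffic from Northern Virginia to Oregon can be sketched as a small function. This is a minimal illustration only: the function name, the health-check input, and the threshold of three consecutive failures are assumptions made for the example, not a description of RebelMouse's actual tooling. The region codes are the standard AWS identifiers for the two locations.

```python
from typing import Sequence

PRIMARY = "us-east-1"  # AWS region code for Northern Virginia
STANDBY = "us-west-2"  # AWS region code for Oregon


def active_region(recent_primary_checks: Sequence[bool], max_failures: int = 3) -> str:
    """Return the region that should serve traffic.

    recent_primary_checks holds health-check results for the primary,
    oldest first; True means the check passed. The threshold is an
    assumed value for illustration.
    """
    if max_failures < 1:
        raise ValueError("max_failures must be at least 1")
    last = list(recent_primary_checks)[-max_failures:]
    # Fail over only when the most recent `max_failures` checks all failed.
    if len(last) == max_failures and not any(last):
        return STANDBY
    return PRIMARY
```

Requiring several consecutive failures before switching avoids flapping between the two locations on a single bad check.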
Frontends
- We have 4 frontends behind an Elastic Load Balancer, distributed across 2 different zones
- Each frontend is a c3.2xlarge instance
  - 8 cores 2.8GHz
  - 15GB RAM
  - 2 SSD disks of 80GB each
- All frontends are running Varnish + Nginx
These 4 pairs of servers running Nginx + Varnish + uWSGI can handle 300,000 online users on a single page because of smart and effective caching. We can add as many of these pairs as we need very quickly because we have automated the process. High growth in frontend requests during massive traffic peaks is handled gracefully because Varnish serves requests in a matter of seconds. All pages and modules are highly cacheable, and the code has been carefully written and tested to ensure that this scales linearly.
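If scaling really is linear, the figures above imply simple arithmetic: 4 frontends serving 300,000 online users works out to 75,000 per frontend. A minimal sketch of that estimate follows, assuming perfectly linear scaling. The function name `frontends_needed` is hypothetical and not part of any RebelMouse tooling.

```python
# Figures quoted in the text above.
BASELINE_FRONTENDS = 4
BASELINE_ONLINE_USERS = 300_000


def frontends_needed(target_online_users: int) -> int:
    """Estimate the frontend count for a target load, assuming linear scaling."""
    if target_online_users < 0:
        raise ValueError("target_online_users must be non-negative")
    users_per_frontend = BASELINE_ONLINE_USERS // BASELINE_FRONTENDS  # 75,000
    # Ceiling division using integers only, to avoid float rounding.
    estimate = -(-target_online_users // users_per_frontend)
    # Never go below the baseline of 4 frontends.
    return max(BASELINE_FRONTENDS, estimate)
```

For example, `frontends_needed(1_000_000)` returns 14 under these assumptions. Real capacity depends on cache hit rates and page mix, so this is a first estimate rather than a sizing rule.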
Databases
MySQL
- 3 c3.4xlarge instances with master/slave configuration
  - 16 cores 2.8GHz
  - 30GB RAM
  - 2 SSD disks of 160GB each
Redis
- 2 m2.4xlarge instances with master/slave configuration
  - 8 cores 2.6GHz
  - 68GB RAM
  - 2 EBS disks of 250GB each
MongoDB
- 2 hi1.4xlarge instances with replica set configuration
  - 16 cores 2.4GHz
  - 60GB RAM
  - 2 SSD disks of 1024GB each in RAID 0
- 1 m1.xlarge instance (4 cores 2.4GHz - 15GB RAM) as a delayed member
- 1 m1.xlarge instance (4 cores 2.4GHz - 15GB RAM) for non-critical
MongoDB (stats)
- 2 c3.2xlarge instances
  - 8 cores 2.8GHz
  - 15GB RAM
  - 2 SSD disks of 80GB each
Our policy is to make sure that all database instances are underutilized so we have a lot of capacity in case of big events that send lots of dynamic requests into the data layer and/or custom requests from partners.
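The value of keeping instances underutilized can be shown with a short calculation: an instance running at 25% utilization can absorb roughly four times its current load before saturating. The sketch below is illustrative only. The 50% ceiling, the function names, and the instance names in the usage example are assumptions, not RebelMouse's actual thresholds or hosts.

```python
from typing import List, Mapping


def burst_multiple(utilization: float) -> float:
    """How many times the current load an instance could take before saturating."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1 / utilization


def over_ceiling(utilization_by_instance: Mapping[str, float],
                 ceiling: float = 0.5) -> List[str]:
    """Names of instances running above the target ceiling, sorted."""
    return sorted(
        name for name, used in utilization_by_instance.items() if used > ceiling
    )
```

For example, `over_ceiling({"mysql-1": 0.3, "redis-1": 0.7})` returns `["redis-1"]`, flagging the one instance that no longer has the intended headroom.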
Security Culture
We believe that security is not a one-time effort that follows an audit, but a continuous, never-ending process that we are dedicated to working on. The team responsible has high-profile experience across all the layers of security.
One example of a constant process that we are proud of supporting and committing development & system admin time to is a bounty program where we pay $100 for any valid (even minor, as long as it's valid) vulnerability found.
We work with Enterprise clients and their security/penetration testing teams to conduct the audits and react quickly whenever needed.
Security Details
We enforce SSL across all launches and work with clients to get the right certificates installed. Everything is balanced between 2 Amazon Zones, and 1 Zone is for backups. All RebelMouse email, Redmine, and administrative tools use multi-factor authentication.
Security Specialists at RebelMouse
RebelMouse has a security team of two full-time developers dedicated to reviewing code and reacting to any reported flaws, and employs a freelance team of 25 white hat security specialists who are paid for each valid security flaw they find.
Disaster Recovery Plans
Source Code Backup: We have daily backups of all dev instances on S3. We also have code on the frontends and a few more servers, and developers have copies on their local machines, so there is effectively no chance we can lose code. The one exception is a branch that only a single developer had and that he or she dropped within the same day, before it appeared in a backup.
Data is backed up separately depending on the storage. For MongoDB collections we use hot and delayed backups for each day of the latest week. For smaller things we do daily backups and store them on S3. We also have hot backups in another AWS region.
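The one-week retention of daily backups can be sketched as a simple filter over backup dates. This is a minimal illustration under stated assumptions: the function name `backups_to_keep` is hypothetical, the window is taken to be the 7 calendar days ending today, and the sketch does not model the difference between hot and delayed backups.

```python
from datetime import date, timedelta
from typing import Iterable, List


def backups_to_keep(backup_days: Iterable[date], today: date,
                    window_days: int = 7) -> List[date]:
    """Return the backup dates that fall within the retention window.

    The window covers `window_days` calendar days ending on `today`,
    inclusive. Duplicate dates are collapsed.
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    oldest = today - timedelta(days=window_days - 1)
    return sorted(d for d in set(backup_days) if oldest <= d <= today)
```

Anything outside the returned list would be a candidate for deletion in a scheme like this.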