BigPanda's Blog


Ansible is a great automation tool. We use it for server provisioning, application deployments and running maintenance scripts. One problem it does have, however, is how inconvenient playbooks are to run compared with regular shell scripts. Write and run enough Ansible playbooks, and eventually you’ll get tired of all the repetitive typing.

Take for example a sample playbook.yml file. If you need to specify an inventory file and some variables, you can end up with:

ansible-playbook playbook.yml -i hosts -e var1=val1 -e var2=val2
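To make the repetition concrete, here is a minimal sketch of a helper that assembles that same command from its parts. The function name `build_playbook_command` and its parameters are hypothetical and not part of Ansible; it only uses the `-i` and `-e` flags shown above.

```python
import shlex


def build_playbook_command(playbook, inventory=None, extra_vars=None):
    """Assemble an ansible-playbook argument list (hypothetical helper)."""
    cmd = ["ansible-playbook", playbook]
    if inventory:
        cmd += ["-i", inventory]
    # One -e flag per variable, mirroring the command shown above
    for name, value in (extra_vars or {}).items():
        cmd += ["-e", f"{name}={value}"]
    return cmd


cmd = build_playbook_command(
    "playbook.yml",
    inventory="hosts",
    extra_vars={"var1": "val1", "var2": "val2"},
)
# Print the command as a single shell-safe string
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would run the playbook, assuming `ansible-playbook` is installed and on the PATH.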

And if you use vault it can be even longer:

A bigger team and a brand new office!

It's been a busy summer! At BigPanda, we're reinventing Incident Management for Ops and that takes a kickass team. Thanks to your enthusiasm for what we're doing, we've doubled since April (including a strategic hire directly from Atlassian's JIRA). To help accommodate all this growth, we just moved into our brand new office in Mountain View, CA. Want to join the BigPanda team? We'd love to have you!

Last week, I changed the color of the GET A FREE ACCOUNT button on the BigPanda website and it resulted in a dramatic improvement in signups.

Data and the Science of Leadership

But it wasn’t my idea. It was the data’s idea. Data makes great decisions. We don’t. Any of us. Leadership is a science, not an art. The last decision you should make is to never make another decision. Data makes the only great decisions. Get out of its way. If you've ever met me you'll know that I have a strong passion for data and the science of leadership. 72% of the people that I tell this to want to hear more of the story. So here it is...

Modeling your production environment correctly is very important for development. Developers need to be able to run and test their code locally for the development process to be efficient, and this often requires setting up production infrastructure on their local machines. The basic solution is a simple Vagrant box containing all your infrastructure and application code, like the one we mentioned in our Devbox post. But that's a basic everything-on-one-server setup. In production, you may have 3 database servers, 2 application servers and 2 caching servers. Pretending that a one-machine-to-rule-them-all model is accurate is… misleading. You can’t test scaling issues, catch race conditions, spot bad distributed design decisions, etc. until you reach production.

What if you could model clustered or distributed systems as multiple machines, you know, like they are in real life? While making it easy enough to customize so that the notoriously lazy developers actually use it? Without duplicating your production scripts? This post is about my solution to this problem using Vagrant and Ansible.

Monitoring applications in production has never been easier. With only a few lines of code, you'll have New Relic installed and monitoring your application from nearly every angle. When something goes wrong, New Relic will start sending alerts. But then what? (hint – New Relic and BigPanda together are the answer)

You've been alerted and you need to take action. But in order to truly understand the incident, you first need to see those New Relic alerts in the context of alerts from your other monitoring systems like Nagios, OpsView, Icinga, CloudWatch and the others. Next, it's important to be able to quickly correlate alerts from the application stack with alerts from the server or network stack. You want to easily assign incidents to other members of your team so you can make sure that every incident has a clear owner and you can track its progress until it has been resolved. Maybe above all, you want to stop using email to do all of the above.

This is where BigPanda comes in. BigPanda gives Modern Ops teams a dynamic, automated incident management solution. BigPanda cuts through the complexity of your monitoring stack and consolidates alerts from all of your monitoring systems. See incidents more clearly. And New Relic is a great place to start. Connecting New Relic and BigPanda is so easy, it’ll take you less than 5 minutes to complete. We promise!

BigPanda is an incident management platform for Modern Ops teams. Organize, prioritize and triage your incidents faster and more intelligently than ever before. Vastly improve your team's collaboration around Ops alerts and events. The following guide is the first in our series on getting started with BigPanda's incident feed. This product introduction will help you to get up and running quickly so you can get back to fixing the world's broken stuff.

Part 1 - Getting Started with BigPanda: The Incident Feed
Part 2 - Getting Started with BigPanda: Incident Triage
Part 3 - Getting Started with BigPanda: Incident Analysis
Part 4 - Getting Started with BigPanda: Incident Assignment

The Incident Feed

The most important section of the BigPanda interface is the incident feed, and it is always just a click away: click OpsBox in the menu at the top. Here you can track and manage all of your active incidents – no matter what system they're coming from: Nagios, New Relic, Pingdom, Zabbix, CloudWatch, Zenoss, and more than a dozen others – and the list is growing rapidly.

BigPanda is an incident management platform for modern Ops environments. With BigPanda, you will prioritize and route your incidents better and faster, while vastly improving your team's collaboration and processes. This is part 2 in a series on Getting Started with BigPanda. This guide will help you get up and running quickly and maximize the value you get out of the platform.

Part 1 - Getting Started with BigPanda: The Incident Feed
Part 2 - Getting Started with BigPanda: Incident Triage
Part 3 - Getting Started with BigPanda: Incident Analysis
Part 4 - Getting Started with BigPanda: Incident Assignment

Incident Triage

As we discussed in part 1 of this series, BigPanda automatically groups incoming alerts into incidents by host, cluster, or application. Once that's done, the next step is to decide on each incident's priority. This process is known as incident triage, and it ensures that your team is channeling its efforts wisely. BigPanda gives you two easy-to-use ways to prioritize your work: snoozing and starring.
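The grouping step that precedes triage can be pictured with a small sketch. This is an illustration only, not BigPanda's actual algorithm: the alert fields (`host`, `cluster`, `check`) and the `group_alerts` function are assumptions made for the example.

```python
from collections import defaultdict


def group_alerts(alerts, key="host"):
    """Group alert dicts into incidents keyed by the chosen field."""
    incidents = defaultdict(list)
    for alert in alerts:
        # Alerts that lack the field land in an "unknown" bucket
        incidents[alert.get(key, "unknown")].append(alert)
    return dict(incidents)


# Hypothetical alerts, as they might arrive from different monitoring systems
alerts = [
    {"host": "db-1", "cluster": "database", "check": "disk"},
    {"host": "db-1", "cluster": "database", "check": "load"},
    {"host": "app-1", "cluster": "application", "check": "http"},
]

by_host = group_alerts(alerts, key="host")
by_cluster = group_alerts(alerts, key="cluster")
```

Here the two `db-1` alerts end up in a single incident when grouping by host, so the team has one item to prioritize rather than two.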
