Customer Story

Netlify Achieves Real-Time Observability at Scale with CrowdStrike Falcon LogScale

Netlify is a remote-first cloud computing company. Founded in 2014, the company provides a web development platform to connect, build and run high-performance sites, stores and apps for over three million developers and businesses, including Google, Facebook, Kubernetes, Samsung and Cisco. It offers a toolbox for front-end developers who want to move a website or web app from a monolith to a decoupled Jamstack architecture, supporting the development workflow from preview to production.

Hitting an Inflection Point with Logs

Too much popularity was the problem. Netlify had outgrown its old logging system, which was getting bogged down by the increased traffic and unable to meet the company’s growing needs. Netlify Principal Engineer Ryan Neal describes the realization that the old system was no longer working.

“We were hitting an inflection point of scale where we were getting a lot more popular, and a lot more log volume was coming through the system,” says Ryan. “Our current solution started to have problems returning inquiries in a timely manner.”

Ryan tried stitching together a custom solution using a number of different logging aggregation frameworks. As it began to consume more and more of his time, his manager took notice.

“My CTO came to me and said, ‘Look, I need you to build this network, this product, our stuff … not a logging aggregation framework,’” recalls Ryan.

Netlify had the additional requirement of implementing a log management solution that would support sales and customer support teams. With the previous solution, developers had to build custom queries to give these teams the insights they requested, which was a time-consuming endeavor.

If Netlify could implement a more user-friendly logging solution, everyone would benefit. “Support and sales would come to us with different requests, and we would often need to build tooling or dashboards, which slowed everything down,” explains Ryan.

The Winning POC

Netlify began an expansive search for a modern log management solution.

“We reached out to Splunk, Elasticsearch and pretty much any company that started with ‘log.’ The tech team ran several POCs, and CrowdStrike Falcon LogScale ended up winning out, thanks to its featureset capability,” says Ryan.

For Netlify, the winning feature of CrowdStrike Falcon LogScale is its ability to customize logs to accommodate the requests of other departments.

“The key factor was our ability to customize the tool,” says Ryan. “There’s a lot of contextual knowledge in our logs. Being able to share that knowledge via saved searches, dashboards and common queries enabled my operations team to run faster and helped our engineering team give sales and support the information they needed.”

The trial and testing phase wasn’t over yet. Netlify also needed to confirm that the new tool could answer the same questions its previous solution had.

“We installed Filebeat and started piping data in parallel to Falcon LogScale. Not only did we realize we can answer the same questions, but we could actually gain more information from the solution,” says Ryan. “That really solidified our decision.”

A Positive Ripple Effect

Netlify implemented Falcon LogScale. Now, the company can affordably ingest all of its log data, empowering Ryan and his team to make decisions about changes to their system with more speed and confidence.

In addition, Netlify developers have configured Falcon LogScale to make logs accessible to other teams. If a customer reports an incident, for example, the support team can quickly access the logs to figure out what went wrong and resolve the issue.

Implementing Falcon LogScale has also improved the user experience for Netlify customers. Being able to log everything and search those logs in real time allows Netlify to respond quickly to outages, helping Netlify customers keep their systems up and running.

“Falcon LogScale helps us serve our users better by increasing uptime and reducing our mean time to discovery. In the event of an incident, we’re able to look in our metrics charts and see issues and problems. We see averages, percentiles, throughput rates dropping, latency spiking — things like that. And then we’re able to jump through the Falcon LogScale data and figure out, is that one client, is it a region? — any kind of problem,” says Ryan.

Netlify’s use of Falcon LogScale shows how a real-time, scalable log management solution can have positive ripple effects that spread across an organization and beyond to its customers.