Make it Work
Gerhard Lazu
15 episodes
2 days ago
Tech infrastructure that gets us excited. Conversations & screen sharing. 🔧 💻
Technology
All content for Make it Work is the property of Gerhard Lazu and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Keep Alert Chaos in Check
41 minutes
9 months ago

Today we talk with Matvey Kukuy and Tal Borenstein, co-founders of Keep, a startup focused on helping companies manage and make sense of their alert systems. The discussion comes three years after Matvey's previous appearance (https://shipit.show/36), where he talked about Grafana Labs' acquisition of his previous startup, Amixr (now Grafana OnCall).

Keep tackles a significant challenge in modern tech infrastructure: managing the overwhelming volume of alerts that companies receive from their various monitoring systems. Some enterprises deal with up to 70,000 alerts daily, making it crucial to identify which ones represent actual incidents requiring attention.

We explore real-world examples of major incidents, including the CrowdStrike outage in July 2024, which caused widespread system crashes and an estimated $10 billion in worldwide damages. The incident showed how critical it is to quickly identify and respond to serious issues among numerous alerts. Matvey also tells us about his biggest black swan experience.

The episode concludes with a hint that some of Keep's AI features may eventually be released as open source once they're sufficiently polished.

LINKS

  • 🎧 Keep on-call simple
  • CrowdStrike - Wikipedia
  • 🎬 The Black Swan Theory
  • Keep Playground
  • Show HN: Keep - GitHub Actions for your monitoring tools

EPISODE CHAPTERS

  • (00:00) - What is new after three years?
  • (02:58) - Take us through the last memorable incident
  • (07:16) - My most black swan
  • (08:50) - How would Keep have made the CrowdStrike experience different?
  • (12:38) - How do companies end up in that place?
  • (15:29) - Keep name origin
  • (17:40) - Why would someone pick Keep?
  • (23:22) - Let's think about our use case
  • (25:03) - Demo ends
  • (28:21) - Reporting capabilities?
  • (30:25) - Deploying & running Keep
  • (33:12) - 2025 for Keep
  • (38:50) - Until next time