Junior Site Reliability Engineer

Strangelove is a primary developer and implementer within the Cosmos ecosystem. Strangelove contributes to core open source software enabling Cosmos infrastructure, provides advisory and implementation services to projects launching on Cosmos, and offers a full suite of infrastructure-as-a-service products. Additionally, Strangelove is an active venture investor in the Cosmos ecosystem through its partnership with Galileo, an early-stage crypto venture firm built to support the expanding Cosmos + Celestia IBC ecosystems.

If you are a system administrator, site reliability engineer, or DevOps professional eager to learn more about networking, blockchains, cryptocurrency, Kubernetes orchestration, and bare metal hosting, let’s talk!

When you join our team, you will be part of our first line of response for operational issues with Kubernetes clusters, host servers, and the chains on which we validate transactions. This is an excellent opportunity to learn quickly with an experienced team operating at the leading edge of blockchain technology.

Responsibilities

  • Monitor and maintain observability systems and alert pipelines

  • Serve in the on-call rotation and respond to incidents

  • Contribute to the automation and maintenance of Strangelove’s infrastructure

  • Use Infrastructure-as-Code to minimize manual labor and overhead for scaled deployments

  • Continually improve security and follow best practices to minimize attack surfaces across our cloud infrastructure

  • Monitor external channels for chain upgrade notifications and coordinate with chain teams

  • Participate in governance voting for Strangelove and delegated funds

Requirements

You must have

  • 1+ years of experience maintaining Kubernetes clusters

  • 1+ years of experience deploying Infrastructure-as-Code; Terraform or Ansible is strongly preferred

  • 1+ years of experience with Docker and containerization

  • Solid understanding of K8s fundamentals such as pods, sidecars, vClusters, and Kustomize

  • Familiarity with Grafana and Prometheus for visualization and alerts

  • Strong self-management skills and the ability to balance multiple competing priorities

  • Availability on some nights and weekends for on-call shifts

  • Proficiency in spoken and written English

You should have a good mix of the following qualifications, but not necessarily all

  • Knowledge of shell scripting (bash, zsh, csh, sh, etc.)

  • Working knowledge of Fedora or Ubuntu distros

  • Experience with remote work and being on an all-remote team

  • Experience with ticketing systems and concepts

  • Experience with build pipelines

  • Understanding of network architecture, including firewalls, IPSec, network security, and DNS

  • Familiarity with site reliability engineering concepts such as service level agreements (SLAs) and service level indicators (SLIs)

Nice to Have

  • Experience with Go, Rust, or other server-side languages

  • Experience with a scripting language like Python

  • High-level understanding of how blockchains work (consensus, transactions, blocks, etc.)

  • Smart contract experience with Solidity or CosmWasm

  • Experience with modern software deployment practices

  • Customer service experience, particularly business-to-business support

Compensation

Competitive base salary

Health, dental, and vision benefits

Perks

  • Opportunity for career development in a fast-paced, emerging industry

  • Work with people who are passionate about what they do and also like to have fun

  • New laptop and continuing education budget

  • 100% remote with some internal travel opportunities

  • Unmetered paid time off


Reach out to careers @ strange.love with a cover letter and resume to apply.