Junior Site Reliability Engineer
Strangelove is a primary developer and implementer within the Cosmos ecosystem. Strangelove contributes to core open source software enabling Cosmos infrastructure, provides advisory and implementation services to projects launching on Cosmos, and offers a full suite of infrastructure-as-a-service products. Additionally, Strangelove is an active venture investor in the Cosmos ecosystem through its partnership with Galileo, an early-stage crypto venture firm built to support the expanding Cosmos + Celestia IBC ecosystems.
If you are a system admin, site reliability engineer, or DevOps professional eager to learn more about networking, blockchains, cryptocurrency, Kubernetes orchestration, and bare metal hosting, let’s talk!
When you join our team, you will be part of our first line of response for operational issues with Kubernetes clusters, host servers, and the chains on which we validate transactions. This is an excellent opportunity to learn quickly with an experienced team operating at the leading edge of blockchain technology.
Responsibilities
Monitor and maintain observability systems and alert pipelines
Serve in the on-call rotation and respond to incidents
Contribute to the automation and maintenance of Strangelove’s infrastructure
Use Infrastructure-as-Code to minimize manual labor and overhead for scaled deployments
Continually improve security and follow best practices to minimize attack surfaces across our cloud infrastructure
Monitor external channels for chain upgrade notifications and coordinate with chain teams
Participate in governance voting for Strangelove and delegated funds
Requirements
You must have
1+ years of experience maintaining Kubernetes clusters
1+ years of experience deploying Infrastructure-as-Code; Terraform or Ansible is strongly preferred
1+ years of experience with Docker and containerization
Solid understanding of K8s fundamentals such as pods, sidecars, vClusters, and Kustomize
Familiarity with Grafana and Prometheus for visualization and alerts
Strong self-management skills and the ability to balance multiple competing priorities
Availability on some nights and weekends for on-call shifts
Proficiency in spoken and written English
You should have a good mix of the following qualifications, but not necessarily all
Knowledge of shell scripting (bash, zsh, csh, sh, etc.)
Working knowledge of Fedora or Ubuntu distros
Experience with remote work and being on an all-remote team
Experience with ticketing systems and concepts
Experience with build pipelines
Understanding of network architecture, including firewalls, IPSec, network security, and DNS
Familiarity with site reliability engineering concepts such as service level agreements (SLAs) and service level indicators (SLIs)
Nice to Have
Experience with Go, Rust, or other server-side languages
Experience with a scripting language like Python
High-level understanding of how blockchains work (consensus, transactions, blocks, etc.)
Smart contract experience with Solidity or CosmWasm
Experience with modern software deployment practices
Customer service experience, particularly business-to-business support
Compensation
Competitive base salary
Health, dental, and vision benefits
Perks
Opportunity for career development in a fast-paced, emerging industry
Work with people who are passionate about what they do and also like to have fun
New laptop and continuing education budget
100% remote with some internal travel opportunities
Unmetered paid time off
Reach out to careers @ strange.love with a cover letter and resume to apply.