Calling all SREs with a Top Secret clearance in the San Antonio, TX area.
Seeking a Senior Site Reliability Engineer (SRE) with an active DoD Top Secret clearance to support a long-term contract with the USAF. In this role you will be responsible for the availability and reliability of critical platform services and applications, ensuring they meet the requirements of internal and external users.
The position is remote, apart from on-site training during the first 2 weeks and up to 1 week each quarter at Joint Base San Antonio (Lackland AFB). Our client is looking to onboard the selected candidate as a direct full-time employee.
Responsibilities:
Ensure that DevSecOps principles are followed throughout the entire software delivery lifecycle.
Be responsible for the availability and reliability of critical platform services and applications, ensuring they meet the requirements of internal and external users.
Be available to respond to incidents that impact platform availability, and support service engineers handling customer incidents.
Build/run infrastructure with Chef, Ansible, Terraform, GitLab CI/CD, and Kubernetes.
Build monitoring that alerts on symptoms rather than on outages.
Document every action so your findings turn into repeatable actions and then into automation.
Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
Design, build, and maintain core infrastructure that enables platform operations, scaling, security, and reliability.
Debug production issues across services and levels of the stack.
Requirements:
- Must be a US citizen and possess a Top Secret security clearance with SCI eligibility.
- 4+ years of hands-on experience in a Senior Site Reliability Engineer role with:
o SRE principles, practices, tools, and automation
o Service Level Objectives (SLOs) & Error Budgets
o Monitoring and Service Level Indicators (SLIs)
o Anti-fragility, performance management, and incident management
o Automated testing
- 3+ years of experience providing technical guidance to other engineers.
- Advanced understanding of Agile and DevSecOps methodologies.
- Possesses and applies comprehensive knowledge across key tasks and high-impact assignments.
- Works effectively with cross-functional teams to understand the overall state of health of a product and/or program.
- Ability to evaluate performance results and recommend major changes affecting short-term project growth and success.
- Expert on the CI/CD process and the overall vision of the program.
- Experience collaborating with business leaders to build and run sustainable production systems.
- Must hold one or more of the following certifications, or be able to obtain one within 30 days: CISSP, CASP+ CE, or CSSLP.
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) certification is strongly preferred.
- Experience working with Platform One Big Bang is strongly preferred.
San Antonio, TX
1
Monday, June 17, 2024
Direct Hire
PERM
Tuesday, April 16, 2024
Know someone who would be a good fit? We pay for referrals!