The team is responsible for supporting our global Application Platforms as an always-available service. Additionally, the team implements, configures and automates the installation of Application Platforms across the enterprise, both in on-premises data centers and in the cloud.
In this role, your key area of focus will be enabling infrastructure automation and support for mission-critical systems. The team’s goal is to provision environments through automation, industry standards and best practices. Key to success is monitoring and ensuring the health of our mission-critical systems.
You will work collaboratively with other Application Operations team members, Architects and Software Engineers in this endeavor.
Responsibilities will include the following:
- Identify, develop, implement and improve automation of infrastructure and operations activities in the cloud, particularly on AWS
- Proactively monitor and administer a growing production footprint of diverse applications, and mitigate issues as they arise
- Troubleshoot and resolve issues in all environments through detail-oriented root cause analysis and technical deep dives
- Contribute to team efforts to maintain processes and tools for infrastructure, monitoring and operations with clear documentation
- Create and refine tools and scripts to both monitor and measure application performance
- Implement strategies for optimization, high availability and recovery
- Ability to communicate well with multiple cross-functional stakeholders
- Efficiently manage multiple work streams with clear and proactive communication of status, as both a self-starter and a team player
- Strong Windows and intermediate Linux administration background
- Strong automation mindset based on past success, including the ability to provide automation guidance to team members and developers
- Experience with automation frameworks, particularly Ansible, Terraform, and CloudFormation, or similar “Infrastructure as Code” solutions (Chef, SaltStack, etc.)
- Strong capabilities in scripting (Python, PowerShell, AWS CLI, Bash)
- AWS support-level experience with EC2, ELB, S3, VPC, CodeCommit/Git, and Route 53 DNS
- Application monitoring experience, preferably including AWS CloudWatch
- Experienced in troubleshooting, deep-dives, and debugging
- Strong understanding of infrastructure operations best practices, including experience in IT Operations in a highly available model
- Strong Web Server experience in IIS, Apache, or Tomcat
- Functional knowledge of Database concepts
- Comfortable operating in a matrix organization, with a strong ability to communicate across teams to ensure alignment of technology strategy, best practices and successful delivery
- Understanding of SDLC processes
- Availability to support systems on-call after hours as needed
- Strong ability to learn and use new technologies and frameworks
- Understanding of High Availability services concepts
Additional Desired Qualifications
- AWS Certification (preferably SysOps)
- AWS experience or functional knowledge in API Gateway, Lambda, DynamoDB, ACM, IAM
- Caching technologies (Redis)
- Experienced in Application Performance Management, preferably AppDynamics
- Experience using and supporting various Atlassian products, such as Jira, Confluence