Are you looking for your next big challenge?
For more than 90 years, Caterpillar Inc. has been making sustainable progress possible and driving positive change on every continent. Customers turn to Caterpillar to help them develop infrastructure, energy and natural resource assets. With 2018 sales and revenues of $54.7 billion, Caterpillar is the world’s leading manufacturer of construction and mining equipment, diesel and natural gas engines, industrial gas turbines and diesel-electric locomotives.
Caterpillar is investing in our digital future, and we’re looking for the best Site Reliability Engineers. Our iconic products have evolved from mechanical workhorses to highly sophisticated, electronically controlled worksite solutions. This transformation, along with our smart factories and our integrated dealer network, has produced a wealth of data ready to be leveraged by our customers and our dealers. Think you have what it takes to develop the software and architect the platform to support Caterpillar’s digital revolution?
We at Caterpillar Digital are building the digital platform that will deliver industry-leading digital solutions in support of profitable growth for Caterpillar, our dealers and our end customers.
Come join us on this exciting journey, be part of a world-class organization, and play a key role in its digital transformation.
Roles & Responsibilities:
Reliability in highly complex, integrated systems typically spans multiple programming languages, third-party services and integrations, as well as software and hardware, so a Site Reliability Engineer needs to be multi-talented and someone who:
• Thinks about systems: edge cases, failure modes, behaviours, specific implementations.
• Has an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it.
• Has an urge to collaborate and communicate asynchronously.
• Has an urge to deliver quickly and iterate fast.
As a Site Reliability Engineer, you will be a process-, technology- and results-oriented member of the Operations team, delivering top-notch service, quality and metrics for the Cat Digital data platform.
Responsibilities include, but are not limited to:
• Meet the SLOs, SLAs and SLIs defined in the Operations model.
• Set task priorities and troubleshoot incidents through to closure.
• Participate in the on-call rotation.
• Improve service observability.
• Proactively test the flexibility and resilience of the system.
• Drive adoption of continuous integration/inspection/deployment.
• Debug production issues across services and levels of the stack.
• Make monitoring and alerting fire on symptoms rather than on outages.
Qualifications:
- Bachelor’s degree, preferably in Computer Science, Software Engineering, or any other Engineering field.
- 4+ years of DevOps experience.
- Knowledge of CI/CD solutions on any platform, backed by prior hands-on experience, is a must.
- Expertise in at least one technology stack, including designing, coding, testing, and delivering software.
- Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage and networks).
- 4+ years of experience with key AWS services: EC2, S3, VPC, Route 53, RDS, CloudFormation, DynamoDB (NoSQL), Lambda, logging/CloudWatch, IAM, Certificate Manager, ELB, EBS, ECS, CloudFront/WAF, SQS, SNS, SES.
- Knowledge of Azure Cloud is an added advantage.
- Expertise in the ELK stack for open-source IT, network, server and application monitoring is an added advantage.
- 4+ years of prior experience in DevOps and/or application development teams. Hands-on experience with large-scale software development, preferably in Java, Python or a scripting language, is a must.
- Understanding of RESTful APIs and of Apigee or any other API gateway is a plus.
- 4+ years of experience with Docker and at least one container orchestration platform (ECS or Kubernetes).
- Understanding of configuration management tools such as Ansible/Puppet/Chef/PowerShell/Terraform.
- Understanding of Git, Bitbucket, Jira, Jenkins, Sonar, Splunk, Maven, AIM and/or other Continuous Delivery tools.