Senior Platform Engineer
Hiire is supporting an early-stage, fast-growing company operating at the intersection of distributed systems, data infrastructure, and operational workflows.
They are building highly scalable, real-time platforms used in demanding and regulated environments, where reliability, observability, and operational excellence are critical. The engineering culture is centered around ownership, systems thinking, and solving complex infrastructure challenges with real business impact.
This is an opportunity to join a highly technical team building the foundations that enable product and AI teams to move quickly, safely, and at scale.
Position Overview
We’re looking for a Senior Platform Engineer to own and evolve the infrastructure powering a distributed, real-time platform. You’ll sit at the intersection of platform engineering, reliability, and developer experience, helping define the internal platform and operational standards that support the company’s growth.
This role goes beyond traditional infrastructure work. You’ll improve developer experience, own observability and incident response practices, and build the “paved roads” that allow engineering teams to ship reliably and efficiently.
The environment is particularly well suited to engineers who thrive in ambiguity, enjoy end-to-end ownership, and want to shape both a platform and the engineering culture behind it.
What you’ll do
• Build and evolve the internal platform infrastructure that enables engineering teams to deploy, test, monitor, and scale services independently
• Improve the software delivery lifecycle by developing reliable CI/CD workflows for both application and AI-related workloads
• Establish and maintain reliability standards across the platform through effective monitoring, alerting, and performance measurement practices
• Take an active role in incident response, troubleshooting production issues, and driving post-incident improvements focused on long-term stability
• Identify bottlenecks and operational risks across infrastructure and distributed systems, proactively implementing solutions to improve resilience and scalability
• Develop automation and internal tooling to streamline operations, reduce manual intervention, and improve platform efficiency
• Oversee and optimize cloud-native infrastructure running on GCP, including containerized and event-driven services
• Support and optimize large-scale search and indexing systems powered by Elasticsearch
• Enhance edge infrastructure, security, and traffic management through Cloudflare configuration and optimization
• Contribute to infrastructure cost optimization initiatives while maintaining strong performance and reliability standards
• Collaborate closely with Product, Engineering, and AI teams to support technical initiatives, platform requirements, and long-term architectural decisions
What you bring
• 5+ years of experience working as a Platform Engineer, Site Reliability Engineer, or Senior/Staff Software Engineer in production environments
• Strong hands-on experience with GCP or equivalent cloud platforms in scalable, high-availability systems
• Deep expertise with Kubernetes (GKE preferred), Terraform, and infrastructure-as-code practices
• Proven experience designing and maintaining CI/CD pipelines using tools such as GitHub Actions, Cloud Build, ArgoCD, or similar
• Strong understanding of modern observability practices, including metrics, logging, tracing, and alerting systems (OpenTelemetry or equivalent)
• Experience operating and supporting distributed systems with a strong focus on reliability, scalability, and operational excellence
• Comfortable participating in on-call rotations and leading incident management processes in production-critical environments
• Strong communication and collaboration skills
• Ability to operate autonomously and make sound technical decisions in fast-paced, ambiguous startup environments
• A systems-thinking mindset
Nice to Have
• Experience operating, scaling, and tuning Elasticsearch clusters in production environments
• Experience deploying and supporting LLMs or other ML models in production
• Familiarity with AI agent architectures, inference pipelines, or GPU-based workloads
• Experience working in regulated, security-sensitive, or high-compliance environments
Why join us
• The opportunity to tackle meaningful and technically challenging problems with real-world impact
• A highly collaborative environment where ownership, autonomy, and technical excellence are genuinely valued
• Flat hierarchy with direct access to the founders and strong influence on technical decisions
• A hybrid role based in Porto (2 days/week onsite), working closely with engineering and leadership
• Private health insurance, plus sick and compassionate leave as needed
Location: Porto
Remote status: Hybrid
About Hiire
We are based in Estonia and Cyprus, and we operate globally. We love to work remotely and travel the world. Freedom, ownership, and a passion for recruitment are key for us.
We hire amazing tech talent for great companies.
We coach highly motivated professionals to build better careers.
We empower recruiters to hire more and better talent.
Do you want to help us grow?