Software Engineer II – AI Infrastructure

Location: Annapolis Junction, MD
Work Schedule: Full-Time, Onsite
Clearance Required: Active TS/SCI with Full Scope Polygraph (FSP)
Salary Range: $193,000 - $306,000

Overview

Join us in building the next generation of AI infrastructure that will power innovation across critical mission environments.

We are seeking an experienced Software Engineer II to support an advanced AI Infrastructure Team responsible for developing and maintaining the platform that serves as the foundation for enterprise AI capabilities. This role focuses on AI inference services while supporting a broader ecosystem of AI-enabled applications, including Retrieval-Augmented Generation (RAG), autonomous agents, and emerging machine learning technologies.

The ideal candidate is a highly skilled engineer who can independently design, build, deploy, and operate scalable infrastructure solutions while helping shape the future of AI adoption across mission-critical environments.

Key Responsibilities

AI Infrastructure & Platform Engineering

Design, implement, and optimize infrastructure supporting AI model inference at scale.
Develop, deploy, and maintain production AI services and applications.
Support emerging AI technologies, including:
- Retrieval-Augmented Generation (RAG)
- Agentic AI Systems
- Large Language Model (LLM) Platforms
- AI Inference Services
Build highly available, reliable, and scalable AI platform components.
Navigate ambiguous requirements and define practical, scalable technical solutions.

Cloud & Systems Engineering

Design and manage cloud-native infrastructure within AWS environments.
Automate infrastructure provisioning and configuration using Infrastructure-as-Code (IaC) principles.
Support Kubernetes deployments and administration across production environments.
Integrate systems across diverse platforms and technologies.
Optimize high-volume web applications and distributed systems for performance and reliability.

Observability & Operations

Implement monitoring, logging, and observability solutions across AI services and infrastructure.
Develop operational dashboards and alerting capabilities using:
- Grafana
- Prometheus
- OpenTelemetry
- Application Performance Monitoring (APM) tools
Support incident response, troubleshooting, and root cause analysis efforts.

DevOps & Automation

Develop and maintain CI/CD pipelines.
Improve deployment automation and operational efficiency.
Promote DevOps best practices across engineering teams.
Drive adoption of modern engineering tools and methodologies.

Security & Collaboration

Contribute to secure AI system design and implementation.
Support compliance with organizational security requirements.
Provide technical guidance and informal mentorship to junior engineers.
Collaborate with software engineers, data scientists, platform engineers, and mission stakeholders.

Required Qualifications

Education

Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related technical discipline.

Substitution: Four (4) additional years of directly related experience may be substituted for a bachelor's degree.

Experience

Eight (8) or more years of software engineering experience.
Proven experience building and supporting production systems at scale.
Experience designing and supporting high-volume web applications.
Experience integrating complex systems across multiple technologies and platforms.
Experience supporting cloud-native infrastructure in AWS.
Experience administering and deploying applications within Kubernetes environments.

Technical Skills

Strong Python development skills.
Proficiency in:
- AWS Cloud Engineering
- Kubernetes
- Infrastructure as Code (IaC)
- CI/CD Pipelines
- DevOps Methodologies
- Monitoring and Observability Platforms
- Distributed Systems Architecture
- Performance Optimization
- Systems Integration

Observability Technologies

Experience with one or more of the following:
- OpenTelemetry
- Grafana
- Prometheus
- Application Performance Monitoring (APM) Solutions

Professional Skills

Strong problem-solving and analytical abilities.
Ability to thrive in ambiguous and rapidly evolving environments.
Strong organizational influence and change management skills.
Excellent written and verbal communication skills.
Ability to work independently and collaboratively within highly technical teams.

Desired Qualifications

Candidates with one or more of the following qualifications are highly desired:

Experience with AI inference serving technologies such as:
- vLLM
- LiteLLM
- Similar inference platforms
Experience with agentic AI frameworks such as:
- LangChain
- LangGraph
- Similar orchestration frameworks
Experience with:
- Vector databases
- Embedding systems
- Semantic search technologies
Knowledge of:
- High-Performance Computing (HPC)
- Distributed Computing Systems
Experience supporting production AI/ML environments.

Compensation

Salary Range: $193,000 - $306,000

Compensation is based on experience, education, technical expertise, and overall alignment with program requirements.

Benefits

Medical Coverage

Choose from three comprehensive medical plans through Aetna. The company pays 80% of monthly premiums for employees.

Health Savings Account (HSA)

Pre-tax contributions for qualified medical expenses
Company contributes 50% of the annual deductible (prorated based on start date)

Dental Coverage

Aetna Passive PPO Max Plan
Company pays 80% of monthly premiums

Vision Coverage

Aetna Vision Preferred Premier 24M Plan
Company pays 80% of monthly premiums

Life Insurance

100% Company-Paid Life Insurance
Accidental Death & Dismemberment (AD&D) Coverage

Short-Term Disability

100% Company-Paid
Pays 60% of earnings up to $1,500 per week for up to 12 weeks

Retirement Plan

Automatic 6% employer contribution to 401(k)
Fully vested from day one
Employee contributions encouraged but not required

Paid Time Off & Holidays

5–6 weeks of PTO depending on tenure
11 paid holidays annually

Professional Development

$5,000 annual tuition reimbursement
Paid training, certifications, and industry conferences
Ongoing support for technical growth and career advancement

Why Join Us?

This is an opportunity to help shape the future of AI infrastructure while supporting critical mission objectives. You'll work alongside top-tier engineers building scalable AI platforms, deploying cutting-edge technologies, and solving some of the most challenging problems in modern software engineering.

https://www.staffed4u.com/