📍 Annapolis Junction, MD
Salary: $160,000 - $230,000
Clearance: TS/SCI with Poly
We are seeking a Principal Software Engineer (SWE-3) with at least 7 years of experience and a strong background in Linux-based development, containerized environments, and modern AI/ML pipelines. This position supports a mission-focused team designing, developing, and deploying advanced Retrieval-Augmented Generation (RAG) solutions in a high-performance computing (HPC) Linux environment.
As a member of the ML Frameworks team, you will contribute to the architecture and deployment of cutting-edge AI systems, including LLMs, orchestration frameworks, knowledge retrieval, and security-aware machine learning models.
Design, implement, and optimize scalable software solutions in HPC Linux environments.
Develop and deploy containerized services using Docker, Podman, Kubernetes, and Docker Compose.
Implement CI/CD pipelines, version control, and monitoring solutions to ensure software reliability.
Integrate and optimize RAG pipelines, LLMs, and embedding models for mission use cases.
Collaborate with cross-functional teams to ensure robust and secure data flows.
Automate and improve processes across the development and deployment lifecycle.
Bachelor’s degree in Computer Science or a related field, plus 7 years of relevant experience.
Proficiency with Linux system administration, CLI, and shell scripting.
Experience with containerization technologies (Docker, Podman, containerd).
Background deploying services under Kubernetes or Docker Compose orchestration.
Recent hands-on software development experience with Python and/or Golang.
Familiarity with RAG pipelines, LLMs, and embedding models.
Experience implementing CI/CD pipelines and tools such as GitLab CI.
Experience with monitoring and metrics tools (Prometheus, Grafana).
Proficiency with Git source control.
Experience debugging and working with GPU-enabled applications.
Familiarity with LLM orchestration frameworks (e.g., Open API).
Experience with distributed processing (Spark, Dask, Ray) for ETL workflows.
Database expertise with SQL, Elasticsearch, and vector databases.
Experience with HTMX or hyperscript.
Knowledge of multi-node, multi-GPU AI model training (HW/SW aspects).
Familiarity with AI inference solutions (NVIDIA NIM/Triton, vLLM, Ray).
Experience with Atlassian tools (Confluence, Jira).
We offer one of the strongest total compensation packages in the industry, including:
🏥 100% Employer-Paid Health, Dental, and Vision Insurance
💸 10% 401(k) Contribution with Zero Vesting
🏖️ 31 Days PTO + Federal Holidays
🎓 Student Loan Repayment Program
📚 Unlimited Certification Training
🏡 Remote Work & Flexible Scheduling Options
🎁 Multiple Incentive Bonuses
🤝 Team Bonding Events & Company Culture Initiatives