Forage AI
Website: forage.ai
Job details:
Location: Remote (Work from Home)
About Forage AI
Forage AI builds next‑generation systems for data collection and processing — large‑scale web crawling, document parsing, data pipelines, and automation. We work primarily in Python, leverage cloud‑native designs (mainly AWS, with exposure to GCP/Azure), and increasingly apply GenAI and AI agents across our stack. Every developer owns their module and collaborates closely with peers in a high‑ownership, high‑trust environment.
Role Overview:
As a Jr. Software Engineer, you will work on software systems for data collection, processing, enrichment, and automation at scale. This is a hands-on engineering role where you will write production-quality code, debug real-world data problems, work with data pipelines, and gradually take ownership of modules.
You will also get opportunities to work with GenAI-based systems, LLM workflows, AI agents, and modern coding assistants. We encourage the use of coding co-pilots and AI tools, but with strong engineering discipline. You should be able to understand, review, test, and take full responsibility for the code you write or generate using these tools.
Key Responsibilities:
· Develop and maintain Python applications for crawling, parsing, enrichment, and processing of large datasets.
· Build and operate data workflows and pipelines (ETL/ELT), including validation, monitoring, and error handling.
· Work with SQL and NoSQL databases (plus vector databases and data lakes) for data modeling, storage, and retrieval.
· Contribute to system design using cloud‑native components on AWS (e.g., S3, Lambda, ECS/EKS, SQS/SNS, RDS/DynamoDB, CloudWatch).
· Build LLM-based systems, RAG workflows, AI agents, and GenAI-enabled automation modules.
· Use coding co-pilots and AI development tools responsibly to improve productivity, while ensuring the code is understood, tested, secure, and maintainable.
· Implement and consume APIs/microservices; write clear contracts and documentation.
· Write unit and integration tests, debug and profile code, contribute to code reviews, and maintain high code quality.
· Implement observability (logging/metrics/tracing) and basic security practices (secrets, IAM, least privilege).
· Collaborate with Dev/QA/Ops; ship incrementally using PRs and design docs.
Required Qualifications
· 6 months–2 years of professional software engineering experience.
· Strong proficiency in Python; good knowledge of data structures/algorithms and basic software design principles.
· Hands‑on with SQL and at least one NoSQL store; familiarity with vector databases is a plus.
· Experience with web scraping frameworks (e.g., Scrapy, Selenium/Playwright, BeautifulSoup) and resilient crawling patterns (respecting robots.txt, rotation, and retries).
· Practical understanding of system design and distributed systems basics.
· Exposure to AWS services and cloud‑native design; comfortable on Linux and with Git.
· GenAI & LLMs: experience with LangChain, CrewAI, LlamaIndex, prompt design, RAG patterns, and vector stores. (Candidates with this experience will be prioritized.)
Preferred / Good to Have (Prioritized)
· CI/CD & Containers: exposure to pipelines (GitHub Actions/Jenkins), Docker, and Kubernetes.
· Data Pipelines/Big Data: ETL/ELT, Airflow, Spark, Kafka, or similar.
· Infra as Code: Terraform/CloudFormation; basic cost‑ and performance‑optimization on cloud.
· Frontend/JS: not required; basic JavaScript or frontend skills are a nice‑to‑have.
· Exposure to GCP/Azure.
How We Work
· Ownership of modules end‑to‑end (design → build → deploy → operate).
· Clear communication, collaborative problem‑solving, and documentation.
· Pragmatic engineering: small PRs, incremental delivery, and measurable reliability.
Work‑from‑Home Requirements
· High‑speed internet for calls and collaboration.
· A capable, reliable computer (modern CPU, 16GB+ RAM).
· Headphones with clear audio quality.
· Stable power and backup arrangements.
Forage AI is an equal‑opportunity employer. We value curiosity, craftsmanship, and collaboration.
Click Apply to learn more.