Sourcebae
Website: sourcebae.com
Job details:
Role: LLM - AI Quality Analyst (Personalization) - English
Experience: 2 to 10+ years
Location: Remote
*Special Requirements*
*English Proficiency*: Ability to read and write English with a high degree of comprehension, as English is the focus language for this project.
*Personal Account Usage*: Willingness to use your primary personal Google account (not a testing account) and enable personal data sources for a genuine assessment.
*Schedule Flexibility*: Mandatory login hours are 7:30 pm to 12:30; the remaining hours are flexible.
*Key Qualifications*
- Exceptional Analytical Thinking: Demonstrated ability to evaluate nuanced and ambiguous AI responses, specifically assessing personalization quality.
- Creative Prompt Engineering: Experience in designing creative, multi-turn starting prompts based on personal context to thoroughly test the model's capabilities.
- Strong Evaluation Acumen: Understanding of personalization concepts, including the ability to identify incorrect personalization, poor inferences, and forced connections.
- Meticulous Attention to Detail: The ability to review Side-by-Side (SxS) model responses and spot subtle differences in naturalness and overnarrating.
- Excellent Written Communication: Superior ability to write clear, concise, and structured rationales for model rankings, explicitly referencing specific turn numbers.
- Feedback: Ability to provide constructive feedback and detailed annotations.
- Communication: Excellent communication and collaboration skills.
- Independence: Self-motivated and able to work independently in a remote setting.
- Technical Setup: Desktop/Laptop set up with a good internet connection.
Description:
In this role, you will be part of a dynamic team focused on evaluating the quality of personalized AI interactions. Your day-to-day work will involve:
- Designing and executing multi-turn conversational prompts (typically 1-5 turns) that require the AI to utilize your personal information and experiences.
- Evaluating model responses based on your intent from the starting prompt, checking if the personalization was appropriately applied.
- Analyzing responses for Grounding issues, ensuring claims about you are supported by evidence and not flawed inferences or hallucinations.
- Assessing Integration quality to ensure personal data is woven naturally into the response without robotic "overnarrating".
- Rigorously evaluating and stack-ranking two model responses side-by-side (SxS) to determine which is overall more helpful, easy to use, and enjoyable.
- Writing clear, defensible rationales for your comparisons, explicitly referencing where issues or positive aspects occurred in the conversation.
- Extracting and verifying "Debug Info" from the model to confirm that chat summaries and data sources were properly utilized.
- Maintaining strict data hygiene by deleting evaluation conversations to prevent them from polluting your future chat history.
*Education & Experience*
- BS/BA degree or equivalent experience in a relevant field (e.g., Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field).
- Experience in data annotation, AI quality evaluation, content moderation, or a related role is strongly preferred.
If interested, please submit your CV to Khushboo@Sourcebae.com or share it via WhatsApp at 8827565832.
Stay updated with our latest job opportunities and company news by following us on LinkedIn: https://www.linkedin.com/company/sourcebae
Click on Apply to know more.