G Talent
AI QA Specialist (LLM Evaluation)
Quality Assurance, Quality Control
7M - ***M JPY
Tokyo
Full-time (permanent)
Online Interview: Yes
Remote Work from Overseas: No
Remote Work in Japan: No
Top-Selling Points
Rate of Foreigners
Less Than
Frequency of English Usage
Less Than
Related Skills
Management/System project Experience
Others
Language Requirements
Required
- Japanese: Daily Conversational
- English: Business
Preferred
- No Data
Job Description
Required Skills and Experiences
【Required】
Bachelor's degree in computer science, software engineering, artificial intelligence, machine learning, mathematics, physics, or a related field, or equivalent professional experience
3+ years of professional experience as a software engineer or QA engineer
Knowledge of evaluation methods for LLMs and generative AI (prompt evaluation, quantitative measurement of output quality, hallucination detection, etc.)
Basic knowledge of statistics and experimental design
Experience building evaluation pipelines in Python
Experience integrating tests into CI/CD pipelines
Experience designing prompt and tool regression tests
Language proficiency (one of the following is required):
- Japanese: Fluent (able to discuss product development without misunderstandings)
- English: Business level
【Preferred】
Experience designing evaluation benchmarks for NLP and ML
Knowledge of AI safety and Responsible AI
Experience with red teaming and penetration testing
Experience evaluating multi-agent workflows, tool usage, and long-context scenarios
Experience with large-scale data processing (e.g., Spark, BigQuery)
Ability to read research papers and reproduce their results
Ability to communicate technically in English
Business
Recruitment business specializing in career-change support for foreign IT professionals
Available Jobs at G Talent
Senior Backend Engineer (Digital Twin)
5M JPY-***M JPY
AWS, Azure
Backend Engineer (SaaS Application Development) *Node.js(TypeScript) / Golang
7M JPY-***M JPY
Go, Machine Learning, Node.js (+2)
Information Security Management - Financial Business Launch
8M JPY-***M JPY
Others
Forward Deployed Engineer (FDE)
7M JPY-***M JPY
Python
Technical Project Manager (LLM New Business)
7M JPY-***M JPY
Big Data, Deep Learning, Machine Learning (+2)
Product Manager (New Business Planning)
6M JPY-***M JPY
Others
Analytics Engineer
6M JPY-***M JPY
AWS, Big Data, SQL
Front End Engineer
5M JPY-***M JPY
JavaScript, TypeScript, Vue.js
[Engineer] Open Positions (Tokyo/Osaka)
5M JPY-***M JPY
CSS, HTML, JavaScript (+2)
Backend Engineer (Scala)
6M JPY-***M JPY
AWS, Azure, Java (+2)
Recommended Jobs
G Talent
Product Manager (New Business Planning)
6M - ***M JPY | Tokyo
Others
Goofy inc.
Project Manager (Package and Middleware Systems)
4M - ***M JPY | Tokyo
Salesforce, Others, Project Management
G Talent
【ProdDev】Design Manager (SaaS Business)
7M - ***M JPY | Tokyo
Others
G Talent
Full-stack Specialist
9M - ***M JPY | Tokyo
Android, AWS, Docker
bryza co.,ltd.
[Open position / If you are having trouble, click here! Software Design
3M - ***M JPY | Hokkaido(+47)
Others
G Talent
Product Manager (Ad Tech, Ad Fraud)
5M - ***M JPY | Tokyo
Others
G Talent
Technical Project Manager (LLM New Business)
7M - ***M JPY | Tokyo
Big Data, Deep Learning, Machine Learning