O
OmiliaAI / Conversational Technology

Senior Data Architect

Greeceremotesenior

Posted 1mo ago · via Workable

About this role

Accountabilities Own the Training Environment data architecture end-to-end: dataset design and schema for all ML training pipelines, including dialog corpora for LLM training, conversational steps for NLU models, annotated evaluation sets, and whole-call recordings for speech-to-speech model development. Define and govern data selection and sampling strategy: establish criteria that determine which production conversations have the highest training value, including diversity-optimized sampling, confidence-based filtering, edge-case prioritization, and deduplication strategies. Build and maintain the data catalog and dataset discovery infrastructure: enable ML engineers across LLM, NLU, Speech, and Agentic teams to find, understand, and use training data without friction.…

Read the full description on Omilia's site →

What we'd score you on

reqspace match rubric

Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.

1

Skills match

For this role: python, sql, snowflake, aws, s3…

2

Level fit

This role is senior-level. We check your trajectory against it.

3

Domain experience

Your work in the role's domain matters more than your years total. We weight recent and direct experience.

4

Recency

A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.

5

Location fit

This role is remote-eligible — we factor in your stated location and time-zone overlap.

Score yourself on this role.
Free · no card · written explanation included
See if I'm a fit →

Skills in this role

Pulled from the job description. These are the keywords we'll weight when scoring your fit.

pythonsqlsnowflakeawss3sagemakerairflowdbthuggingfaceteams

More at Omilia

See all open jobs at Omilia