About the job
At Sensorfact, our mission is to reduce the human environmental footprint by eliminating all waste in industry. We help our customers gain real-time insights into their operations through high-resolution, machine-level sensors that monitor electricity, gas, water, compressed air, production speed, and machine health.
As a Data Scientist, you will play a key role in transforming raw sensor data into valuable insights. You will design heuristics and algorithms that enrich this data in real time, empowering customers to better understand and optimize their processes. On top of that, you will develop models that generate actionable and personalized advice. You’ll be involved in every step of the development process: from data quality checks for new sensor types to deploying and maintaining prediction pipelines. With data at the core of our platform, you will be a vital member of a cross-disciplinary team driving meaningful impact.
This is what you will be doing:
You will enrich our sensor data using time-series methodology (e.g. standby detection), by combining measurements from different sensors (e.g. machine efficiency), or by transforming the raw measurements (e.g. FFT-based feature extraction).
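To give a flavour of the last point, here is a minimal sketch of FFT-based feature extraction with NumPy. The function name and the chosen features are purely illustrative, not Sensorfact's actual pipeline:

```python
import numpy as np

def fft_features(signal: np.ndarray, sample_rate_hz: float) -> dict:
    """Extract simple spectral features from a raw sensor signal."""
    # Real-input FFT: frequency bins and their magnitudes
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    # Dominant frequency (skip the DC component at index 0)
    dominant = freqs[1 + np.argmax(magnitudes[1:])]
    # Total spectral energy as a coarse measure of activity
    energy = float(np.sum(magnitudes ** 2))
    return {"dominant_freq_hz": float(dominant), "spectral_energy": energy}

# Example: a 50 Hz sine sampled at 1 kHz for one second
t = np.arange(0, 1, 1 / 1000)
features = fft_features(np.sin(2 * np.pi * 50 * t), sample_rate_hz=1000)
```

In a real pipeline such features would typically be computed over sliding windows of the raw measurement stream and fed downstream as enriched data.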
You will be responsible for designing algorithms that turn our raw and enriched data into actionable advice in a scalable way, such as flagging early signs of machine degradation or estimating efficiency gains from machine upgrades or process changes.
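By way of example, one of the simplest forms such an algorithm can take is comparing recent consumption to a historical baseline. The rule, window, and threshold below are illustrative assumptions, not the actual product logic:

```python
import numpy as np

def flag_degradation(power_kw: np.ndarray, window: int = 24,
                     threshold: float = 0.10) -> bool:
    """Flag a machine whose recent average draw drifts above its baseline.

    Hypothetical rule: compare the mean of the last `window` readings to
    the mean of everything before it; a relative increase beyond
    `threshold` suggests degradation (e.g. worn bearings drawing extra power).
    """
    baseline = power_kw[:-window].mean()
    recent = power_kw[-window:].mean()
    return (recent - baseline) / baseline > threshold

# A healthy machine holds steady; a degrading one creeps up 15%
healthy = np.full(200, 10.0)
degraded = np.concatenate([np.full(176, 10.0), np.full(24, 11.5)])
```

Production versions would need to handle seasonality, varying duty cycles, and missing data, but the baseline-versus-recent comparison is a common starting point.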
You will set up tools that help our consultants interact with these models and feed their observations back into them.
You will work closely with our Data Engineers to create deployable artifacts which will continuously generate insights across our customer base in a mix of streaming and batch applications.
The team brings together many different skills and personalities, and we promote an open, collaborative working environment. You are encouraged to work together on complex tasks, give feedback where you can, and actively think about your personal development.
As part of a fast-moving environment, you work autonomously and proactively prioritise and address the needs of our growing group of customers.
We do Scrum with two-week sprints, sprint planning and retrospective sessions, and daily stand-ups over Google Meet. Our course is set by quarterly goals, determined collaboratively by the business, data, development and product teams.
We know how important it is to get in the zone and write beautiful code, so we try to keep meeting pressure low. We work from home about 70% of the time, but we enjoy meeting each other in the office regularly.
Key technologies you will be working with:
Our current AWS stack is focused on ingesting and processing raw sensor data in real time using Kafka and Flink. Processed sensor data is stored in ClickHouse; other types of data live in Postgres. Batch processing is done with Prefect and Fargate, and on-demand services are deployed using Lambda. A powerful internal GraphQL API, managed by Hasura, exposes data to end users. As a Data Scientist you will use Poetry for package management and GitLab CI/CD to test, build and deploy our code. You can spin up a fresh Fargate cluster at any time and parallelize your workload using Dask.
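Dask's distributed `Client.map`/`Client.gather` follows the same fan-out pattern as the standard library's executors. A minimal, self-contained sketch of that pattern is below; `score_machine` is a hypothetical placeholder for real per-machine model inference, and in production the pool would be a Dask client pointed at a Fargate-backed cluster:

```python
from concurrent.futures import ThreadPoolExecutor

def score_machine(machine_id: int) -> dict:
    # Hypothetical placeholder for a real per-machine model inference
    return {"machine_id": machine_id, "score": machine_id * 0.1}

# Fan the work out across workers and gather the results; with Dask this
# would be client.map(...) / client.gather(...) instead of a thread pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(score_machine, range(10)))
```

The appeal of this pattern is that the per-machine function stays a plain Python function, so the same code can be tested locally and then scaled out across a cluster.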