During the five-week virtual Hackathon, Artefact’s team impressed the judges by developing an NER (Named Entity Recognition) pipeline, with an integrated feedback loop, that detects beauty and cosmetics brands in Twitter posts.

A team of data scientists, ML engineers and data engineers from Artefact’s Paris office was awarded second place in the Hackathon hosted by Flyte and MLOps.community, an outstanding achievement in the MLOps field:

  • MLOps.community is an open community that aims to fill the growing need to share real-world Machine Learning Operations best practices from engineers in the field
  • Flyte is an open-source, container-native, structured programming and distributed processing platform implemented in Golang

The virtual five-week Hackathon consisted of creating an end-to-end ML application with Flyte as the MLOps platform. With the goal of adding real-world value in production, the project could be based on any ML (machine learning) or data application, such as retail use cases, fraud protection, or computer vision. All projects were judged on creativity, how well the team executed the application, and how easy the model UI (user interface) was to understand.

Artefact’s experienced team, composed of Senior Data Scientist / ML Engineer Amale El Harmri, Data Engineer Louis Rousselot de Saint Ceran, Senior Data Scientist Karim Si Larbi, Senior Data Scientist Hugo Vasselin, and Data Scientist Sascha Lasry, worked on this Hackathon in addition to their client and internal workload. During the competition, the team went by the name “adorable-unicorns23.”

“Volunteering to participate in this Hackathon demonstrates our team’s commitment to our company’s values of collaboration and innovation. Whether it’s in the office or outside of the office, we share a passion for creating new things as a team.”
said Amale El Harmri, Senior Data Scientist / ML Engineer at Artefact.

Recognizing that the beauty and cosmetics industry is constantly evolving, the team focused on a possible strategy: finding indie (independently owned) brands that are innovative and popular with the public, with a view to acquiring them. To that end, the team built a brand identification module on Twitter data flows that included a labelling interface.

To complete the project, the team divided the process into three sections:

  • NER application workflow: scraping beauty-related tweets from Twitter and then extracting named entities from the post contents
  • Manual labelling in Label Studio: labelling those same posts by hand to catch any missed or incorrect entities
  • NER training workflow: evaluating the NER model against the manual labels, then either completing the workflow if the model performs well enough or training a new model on the freshly labelled data if it does not
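
The evaluate-then-retrain decision in the third step can be illustrated with a minimal Python sketch. This is not the team's actual code: the exact-match F1 metric, the 0.8 threshold, and the names `entity_f1` and `should_retrain` are all assumptions made for illustration. In a Flyte project, functions like these would typically be wrapped as tasks and chained in a workflow.

```python
from typing import Dict, Set, Tuple

# An entity is a (start_char, end_char, label) triple inside one tweet.
Entity = Tuple[int, int, str]


def entity_f1(
    predicted: Dict[str, Set[Entity]],
    labelled: Dict[str, Set[Entity]],
) -> float:
    """Micro-averaged, exact-match F1 over the manually labelled tweets.

    Both arguments map a tweet id to its set of entities. Tweets that
    were never labelled by hand are ignored.
    """
    tp = fp = fn = 0
    for tweet_id, gold in labelled.items():
        pred = predicted.get(tweet_id, set())
        tp += len(pred & gold)   # entities the model got exactly right
        fp += len(pred - gold)   # entities the annotators rejected
        fn += len(gold - pred)   # entities the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def should_retrain(f1: float, threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: retrain when F1 falls below the threshold."""
    return f1 < threshold


# Illustrative data only: one wrong span in t1, one missed brand in t2.
model_output = {
    "t1": {(0, 5, "BRAND")},
    "t2": {(10, 17, "BRAND")},
}
manual_labels = {
    "t1": {(0, 6, "BRAND")},
    "t2": {(10, 17, "BRAND"), (25, 31, "BRAND")},
}

score = entity_f1(model_output, manual_labels)
if should_retrain(score):
    print(f"F1={score:.2f}: train a new model on the freshly labelled data")
else:
    print(f"F1={score:.2f}: model is good enough, workflow complete")
```

Exact-match scoring is deliberately strict here: a brand detected with a slightly wrong span counts as both a false positive and a miss, which tends to send more borderline models back for retraining.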

“This was the first time any member of our team used Flyte, yet the team was able to submit tasks and workflows very fast, thanks to the platform’s intuitive SDK (Software Development Kit) and documentation. This Hackathon was an incredible experience for the team to demonstrate their advanced MLOps expertise!”
said Robin Doumerc, Staff ML Engineer at Artefact.

To view Artefact’s full online presentation of their project to the MLOps.community’s jury as part of MLOps #98, follow the link here and skip to the 41:22 time stamp.
