PhD candidate, Computer Vision. Gamer.
25 December 2021
Fellowship.Ai is a four-month “unpaid” fellowship covering various machine learning topics. The program is 100% remote, and anyone can apply. (PS: Keep reading to find out whether you should!) You can apply directly by submitting your resume, but completing one of the challenges listed on the website tends to increase your chances of getting an interview. The challenges focus on different machine learning topics such as image segmentation, Natural Language Processing (NLP), one-shot learning, etc.
I applied for the second cohort of 2020 (Sept - Dec ‘20). I worked on the image segmentation challenge using a U-Net model. You can read more about the project here - Image Matting with U-Net. After submitting the project, I got an interview call within the next 2 days. The interview was fairly easy and mostly focused on basic ML knowledge such as F1 score and confusion matrix, plus some intermediate Python programming.
The program started with an orientation for all the accepted fellows of the cohort. You are free to pick any project you like. Most of the projects during my term were either done in partnership with other companies such as Levis or GE, or were very new projects (you will be the first to contribute to them). Each project has about 5-6 fellows working on it. There are no mentors as such.
I was initially working on an ‘Emotion Detection’ project, where, based on video frames, you have to classify whether the person in a Zoom meeting is engaged, non-engaged, animated or distracted. We spent the first month collecting the dataset from various internet scrapes. We were also provided with screenshots of internal Zoom meetings. But the manual labelling had to be done on our own, which was indeed a very boring task. If you are interested in this kind of project, you can start with some of the good datasets available on Kaggle - Eye Gaze Estimation, BIWI Kinect head pose.
I worked on creating a pipeline for the system. The input images were resized to 128x128 pixels.
If the first stage of the pipeline finds other entities in the frame and the classifier labels the person as non-engaged, the person is regarded as Distracted.
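The rule above can be written as a small function. This is only a sketch of that one decision step, not the project's actual code: the function name `final_label`, the boolean flag for "other entities found", and the exact label strings are all assumptions made for illustration.

```python
# Label names follow the four classes mentioned in the post;
# the exact string spellings are assumptions.
ENGAGED = "engaged"
NON_ENGAGED = "non-engaged"
ANIMATED = "animated"
DISTRACTED = "distracted"

CLASSIFIER_LABELS = {ENGAGED, NON_ENGAGED, ANIMATED, DISTRACTED}


def final_label(other_entities_found: bool, classifier_label: str) -> str:
    """Combine the first-stage result with the classifier output.

    other_entities_found: hypothetical flag set by the first pipeline stage.
    classifier_label: label predicted by the classifier for the frame.
    """
    if classifier_label not in CLASSIFIER_LABELS:
        raise ValueError(f"unknown label: {classifier_label!r}")
    # Non-engaged + other entities in the frame is treated as Distracted.
    if other_entities_found and classifier_label == NON_ENGAGED:
        return DISTRACTED
    # In every other case the classifier's label is kept unchanged.
    return classifier_label


if __name__ == "__main__":
    print(final_label(True, NON_ENGAGED))
    print(final_label(False, NON_ENGAGED))
```

Keeping the override as a separate step after the classifier means the rule can be changed without retraining anything.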
- - - -
I was moved to another project since most of the team members expressed interest in working on other projects. After that, I worked on ‘Fridge Food Type Detection’. The project sounds easy but is very difficult to model. First, you don’t have a good pool of images from which to build a high-quality dataset. Second, identifying individual food items is a daunting task: objects are placed so close to each other in a fridge that it is hard to distinguish them. Here, our major work was to generate a synthetic dataset using Generative Adversarial Networks (GANs). GANs have been pretty good lately at generating lifelike images of human faces.
But it turns out they are not so good at generating other objects. Before I joined the group, others had already worked on using StyleGAN-2 by NVIDIA to generate synthetic in-domain images. The initial dataset was built from images taken by the built-in camera of a member’s fridge and from web scraping. Directly training GANs on this preliminary dataset was not successful since there were not many images. So instead, synthetic images were generated, to be transformed into in-domain images afterwards. To do so, a Blender script was used to randomly place 3D objects such as bottles, veggies, etc. in a 3D fridge model (3D scene generation), and JPG images were rendered from the front. Reference - Gregor et al., ICCV ‘19.
I implemented a Pix2Pix GAN to map synthetic images (from Blender) to in-domain images. However, Pix2Pix is a paired GAN approach, and we did not have pairs of synthetic and in-domain fridge images. Instead, I created random pairs and fed them to the Pix2Pix GAN. After 200 epochs on about 1200 image pairs, the results were still not convincing. You can check out the project here - Pix2Pix GAN.
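To make the "random pairs" idea concrete, here is a minimal sketch of how such pairs could be built from two lists of image paths. It is not the code from the project: the function name `make_random_pairs`, the file names, and the choice to pair every synthetic image with one randomly drawn in-domain image are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def make_random_pairs(
    synthetic: Sequence[str],
    in_domain: Sequence[str],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Pair each synthetic image with a randomly chosen in-domain image.

    The pairs carry no real correspondence; they only satisfy the
    (input, target) format that a paired model such as Pix2Pix expects.
    """
    if not in_domain:
        raise ValueError("need at least one in-domain image")
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible
    return [(s, rng.choice(in_domain)) for s in synthetic]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    blender_renders = [f"synthetic_{i:04d}.jpg" for i in range(5)]
    fridge_photos = [f"fridge_{i:04d}.jpg" for i in range(3)]
    for pair in make_random_pairs(blender_renders, fridge_photos, seed=42):
        print(pair)
```

Because the target in each pair is unrelated to the input, the model has no consistent mapping to learn, which may be one reason the results stayed unconvincing.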
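Meanwhile, I started my PhD in November ‘20, and due to the hectic schedule I had to devote more time to it from December. But there are appreciable results using CycleGAN. Why so? Unlike Pix2Pix, CycleGAN is designed for unpaired image-to-image translation, which matches our setting where no paired synthetic and in-domain images exist.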
You can read the entire paper here: The Usage of CycleGAN for Image Translation to Increase the Size of Fridge Food Types Dataset
Job offers from partner companies are mostly confined to US residents. But externships from the fellowship were available to fellows everywhere.
The fellowship does provide a platform for those at an intermediate stage of their machine learning journey. Since the program is unpaid, none of the fellows I met were working elsewhere full-time. So, I think if you are looking to switch your career to an ML field, or you are a final-year student (not the internship), this will be a great opportunity to learn from real-world projects and potentially get job offers.