This page has moved to Apify's student project ideas.

OLD STUFF

This page contains a list of ideas for paid student projects at Matfyz, including BSc and MSc theses. If you're interested in any of the projects, have your own idea, or would like to try an internship with us, just write to [email protected]. Even if you're from ČVUT. ;)

About Apify

We're on a mission to make the web more programmable. Apify provides a cloud infrastructure and tools that let people automate anything a person can do manually in a web browser, and run it at scale.

Apify's systems process billions of web pages and hundreds of terabytes of data every month. Our stack is based on AWS, Linux, Node.js, MongoDB, and dozens of other services.

Apify was founded in 2016 by two friends who met during their studies at MFF UK. We are currently about 60 people, based in an office in Prague's Lucerna Palace, and about 20% of the company is from Matfyz. 🙂

Learn more at apify.com/about or https://apify.com/jobs.

Artificial intelligence-based projects

Fraudulent user detection

TL;DR: Some users create a lot of free accounts on Apify to get free computing resources or to test stolen credit cards. The goal of this project is to create an AI-based system capable of detecting these users.

Long description

The Apify platform gets a large number of new users, making it impractical to vet each one manually. However, some of them sign up intending to abuse our free trial by creating many accounts, to test whether stolen credit card details work, or even to pay for our services with those stolen cards. The goal of this project is to create an AI-based system capable of detecting these users.

We already have some experience with automatically detecting fraudulent users, so we can share insights to help you get started, along with a labeled training dataset for the detector to learn from.
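To make the idea of learning from labeled accounts concrete, here is a minimal sketch of a k-nearest-neighbors scorer. It is only an illustration: the three features, the sample rows, and the names `LABELED_ACCOUNTS` and `fraud_score` are all hypothetical and do not describe Apify's actual data or detector.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical features per account (illustrative only):
# (accounts sharing the signup IP, failed card payments, disposable email? 0/1)
Features = Tuple[float, float, float]

# Made-up labeled examples: 0 = legitimate, 1 = fraudulent.
LABELED_ACCOUNTS: List[Tuple[Features, int]] = [
    ((1, 0, 0), 0),
    ((1, 1, 0), 0),
    ((2, 0, 0), 0),
    ((8, 4, 1), 1),
    ((6, 5, 1), 1),
    ((9, 3, 0), 1),
]


def fraud_score(
    account: Sequence[float],
    labeled: List[Tuple[Features, int]] = LABELED_ACCOUNTS,
    k: int = 3,
) -> float:
    """Return the fraction of the k nearest labeled accounts that are fraudulent."""
    # Sort labeled accounts by Euclidean distance to the new account.
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], account))[:k]
    return sum(label for _, label in nearest) / k


low_risk = fraud_score((1, 0, 0))
high_risk = fraud_score((7, 4, 1))
```

In this toy data the first account sits among legitimate ones and the second among fraudulent ones. A real detector would likely need feature scaling, far more data, and a stronger model; the sketch only shows the train-on-labels, score-new-accounts shape of the problem.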

JSON schema & anomaly detection

Let's say we can extract structured data from a web page as JSON. The goal of this project is to build a machine learning system that tracks how such JSON documents change over time and automatically detects anomalies.
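As a starting point, structural changes between two snapshots can be found without any learning at all. The sketch below is a rule-based baseline, not the machine learning system the project asks for: it infers a simple path-to-type schema from each JSON value and reports the paths whose type changed, appeared, or disappeared. The function names and the sample records are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def infer_schema(value: Any, path: str = "$") -> Dict[str, str]:
    """Map each JSON path to the name of the Python type found there."""
    schema = {path: type(value).__name__}
    if isinstance(value, dict):
        for key, child in value.items():
            schema.update(infer_schema(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for child in value:
            # All list items share one path, so list length does not matter.
            # For mixed-type lists, the last item's type wins in this sketch.
            schema.update(infer_schema(child, f"{path}[]"))
    return schema


def schema_changes(
    old: Any, new: Any
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {path: (old_type, new_type)} for every path that differs."""
    before, after = infer_schema(old), infer_schema(new)
    return {
        path: (before.get(path), after.get(path))
        for path in sorted(before.keys() | after.keys())
        if before.get(path) != after.get(path)
    }


# Made-up snapshots of the same scraped record on two different days.
yesterday = {"title": "Laptop", "price": 999.0, "tags": ["sale"]}
today = {"title": "Laptop", "price": "N/A", "tags": []}

changes = schema_changes(yesterday, today)
```

Here `changes` flags that `$.price` went from a number to a string and that the list items under `$.tags[]` vanished. A learned detector could build on such signals, for example by modeling how often each path normally changes and flagging only unusual deviations.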

Captcha solver