
NLP-based content analysis and curation for classroom protection


Company description

Netop is a Danish software company that develops market-leading software solutions that connect people with computers and smart devices, using remote access, screen-sharing and video chat technologies. Millions of users count on Netop to make 100 million swift, secure and seamless connections every day.

Used by half of the Fortune 100, Netop’s solutions, including secure remote access and live chat, help businesses provide better customer service, reduce support costs and meet security and compliance standards. In education, Netop software is used by 9,000 schools (~75% in the US) and it connects more than 3 million teachers and students, helping schools transform education and improve learning outcomes with tools that make teaching with technology easier and more effective.

Project description

With Netop Classroom Management solutions, we strive to give teachers and students a safe learning environment. Currently, we do this by enabling teachers to better monitor students’ behaviour with features such as viewing students’ screens, blanking them to focus attention on the teacher, or limiting web access to keep students safe and on-task.

Over 75% of our Classroom Management tools are used at US-based schools, where cyberbullying, mass shootings and student suicides are highly sensitive problems. Since we are able to monitor online student behaviour, we are now looking to improve our system so that it can help anticipate and prevent potential issues or even disasters.

We are taking only the first steps towards this goal and we know it is going to be a long journey. We are starting by curating URLs that contain harmful information so that they can be blocked. However, even this is not straightforward. Not all harmful websites are known and blacklisted, so we need our own detection mechanism that can analyse website text.

On top of that, some domains are legitimate, but a small share of the threads in their content might be harmful. For example, Quora and Reddit are both legitimate websites that host niche harmful topics such as “How to buy a gun in America”, which should not be accessible to students. Therefore, we want to build Natural Language Processing (NLP) capabilities into our system to scrape and understand the content of a website or page, and to continuously process and expand the list of harmful websites and pages.
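The page-level idea above can be illustrated with a minimal sketch. It is not Netop's implementation: the term list, the function names and the example.com URLs are all hypothetical, and plain substring matching stands in for the NLP model the project would actually develop. It only shows how a single page can be blocked while the rest of its domain stays reachable.

```python
import re
from urllib.parse import urlparse

# Hypothetical lexicon; a real one would come from a curated word list.
FLAGGED_TERMS = {"buy a gun", "self-harm"}


def flagged_terms_in(text, terms=FLAGGED_TERMS):
    """Return the flagged terms that occur in the text (case-insensitive)."""
    normalized = re.sub(r"\s+", " ", text.lower())
    return sorted(term for term in terms if term in normalized)


def curate(pages, blocked_pages=None):
    """Add only the offending page URLs to the blocklist, not their domains.

    `pages` maps a URL to the text already scraped from that page.
    """
    blocked = set(blocked_pages or ())
    for url, text in pages.items():
        if flagged_terms_in(text):
            blocked.add(url)
    return blocked


def is_blocked(url, blocked_pages, blocked_domains=frozenset()):
    """A URL is blocked if the page itself or its whole domain is listed."""
    host = urlparse(url).netloc.lower()
    return url in blocked_pages or host in blocked_domains


# Two pages on the same (made-up) domain; only the first should be blocked.
scraped = {
    "https://example.com/q/1": "How to buy a gun in America",
    "https://example.com/q/2": "How to bake bread at home",
}
blocked_pages = curate(scraped)
```

Keeping separate page and domain lists is one possible design choice: fully blacklisted sites go in the domain list, while sites like the ones named above would only ever contribute individual page URLs.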

In your Master's thesis project, we would like you to work together with our development team on developing the Natural Language Processing algorithm and the content curation system. Your primary task will be algorithm selection, training and development, while the team will support you with implementation. Once the algorithm has been trained to detect harmful content, we expect to use these new capabilities in other safety areas, such as detecting online bullying or monitoring other malicious behaviour online. We are also considering a machine-learning feedback loop in which students or teachers flag inappropriate content, continuously improving the accuracy of the NLP and of the overall detection system.

To support you with NLP training and development, we will provide access to publicly available data on 3 million websites that belong to a blacklist repository. We also have access to an extensive library of malicious words that could be used for training. There are a number of other open-source datasets that could be used to further enhance the NLP capabilities. If you prefer to go in depth and analyse a given type of content very accurately, the scope could be narrowed down to a single topic, such as self-harm or adult content.

We would like you to:
  • Select the most useful algorithms for the task
  • Train the NLP model on a given data corpus, including malicious words, websites and other relevant data
  • Test the accuracy of malicious content detection
  • Identify ways to continuously improve NLP accuracy by implementing a feedback loop
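
As a rough illustration of the train, test and feedback steps listed above, here is a minimal sketch using a multinomial Naive Bayes classifier with add-one smoothing. Naive Bayes is only an assumed stand-in, since algorithm selection is part of the thesis; the class, the labels and the tiny example sentences are made up for illustration and are not drawn from the datasets mentioned in this posting.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def update(self, text, label):
        """Add one labelled example; flagged content can be fed in the same way."""
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, examples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    return sum(model.predict(t) == y for t, y in examples) / len(examples)


# Made-up toy data, far too small for real conclusions.
train = [
    ("how to buy a gun without a licence", "harmful"),
    ("ways to hurt yourself", "harmful"),
    ("how to bake bread at home", "safe"),
    ("homework help for algebra", "safe"),
]
held_out = [("buy a gun online", "harmful"), ("bake bread for homework", "safe")]

model = NaiveBayes()
for text, label in train:
    model.update(text, label)
print("held-out accuracy:", accuracy(model, held_out))

# Feedback loop: a teacher flags a page the model missed, and it is folded back in.
model.update("cutting tips forum", "harmful")
```

Because `update` only increments counts, flagged examples can be added one at a time without retraining from scratch; a more capable model would probably need periodic retraining instead.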

Student description

We are looking for a motivated Master's thesis student in Data Science, IT & Cognition or a similar technical field who is interested in working with NLP, machine learning, big data and other data science areas. The student(s) should ideally have at least basic knowledge of a relevant programming language (Python, JavaScript, C, etc.); this is not a strict requirement but could be advantageous in the thesis work. An understanding of the US educational system might also give you an edge.

During this project collaboration, you will have support from the lead technical person at Netop. We are open to both Danish and international students studying in Denmark or abroad. Regular meetings will be held remotely, but if you are interested, there is office space to work from outside Copenhagen, and we would be open to coordinating a trip to visit our development team in Bucharest.

Applications will be reviewed on an ongoing basis, and the posting will be closed once the right candidate is found. For any questions, please contact Kris at saikus@matchmythesis.com

Jan 15, 2019



Content Curation



Machine Learning