
Copysearcher

a unique real-time plagiarism detector


We all love reading news and gossip about celebrities on the internet. From the users’ perspective, it’s pure pleasure combined with the speed and convenience of acquiring knowledge. But what about online publishers? Doing their job well requires devoting a lot of time and effort to providing the best quality content.

 

Unfortunately, online plagiarism increasingly leads to the unlawful dissemination of their work on other websites, causing direct losses for the authors of the content. So we’ve come to the rescue with an innovative anti-plagiarism tool that detects publishers’ bad practices in real time.


Challenge

Copysearcher was created for one of our partners who specialised in creating content read by millions of users every day.

 

When running a news website, it’s crucial to reach the audience with the latest information that they will not find elsewhere, in order to keep their interest and loyalty. As we all know, journalism has its own rules. To do the job well, you need to devote a lot of effort to obtaining the necessary information, verifying sources, and conducting interviews.

 

Unfortunately, even when you provide the best quality content, it’s sometimes hard to keep it exclusive to your audience. Some publishers prefer to cut corners and copy others’ content, hoping to remain undetected in the endless abyss of the web.

 

Plagiarism is one of the most common bad practices among online creators. Worse, it often doesn’t end with mere inspiration: content theft is more common than it might seem. Our partner was motivated to fight such incidents. So were we, with an innovative anti-plagiarism approach tailored to the needs of online content creators.

Goals

Our goal was clear: we wanted to create a tool that tracks content plagiarism in real time, providing users with instant information about any entity that has stolen their content.

We needed to use real-time web crawlers to search the web and index the content they find. While tracking simple ‘copy-paste’ plagiarism might not be especially hard, detecting paraphrased text or edited images and videos is a different story. We had to find a way to improve the accuracy of the system and a process for finding paraphrases and edited content.
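
To illustrate why verbatim copies are the easy case, here is a minimal sketch of word-shingle overlap (Jaccard similarity). It is not Copysearcher's actual implementation; the function names, the shingle size `k`, and the sample sentences are assumptions made up for this example. A copied sentence scores high, while a paraphrase shares almost no word sequences with the original, which is why a semantic approach is needed for it.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets, between 0 and 1."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts, invented for illustration only.
original = "The actor confirmed the new film will premiere in Warsaw next spring."
copied = "The actor confirmed the new film will premiere in Warsaw next spring."
paraphrased = "Next spring, the movie's Warsaw debut was announced by the star."

copy_score = jaccard(original, copied)
paraphrase_score = jaccard(original, paraphrased)
print(f"copy: {copy_score:.2f}, paraphrase: {paraphrase_score:.2f}")
```

The paraphrase conveys the same news but scores near zero here, so surface overlap alone would miss it.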

We also wanted our tool to help publishers take legal action to remove illegally duplicated content from other portals.

Solution

  • We used real-time web crawlers to search the web and index content.

  • Thanks to advanced AI and image processing techniques, Copysearcher can scan through more than 10,000 pages per hour.

  • We applied word embedding and deep neural networks for complex semantic text matching, providing recognition of copied and paraphrased pieces of text.

  • We applied object detection combined with validation tests to recognise copied and edited photos and videos.

  • We added easy access to Google’s ‘Copyright Removal’ form.

  • To simplify legal steps, we added the option of sending a direct email to the person related to the plagiarized content and their legal team.
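
The matching step behind the semantic text comparison above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the production code: it assumes each text has already been turned into an embedding vector by some model (not shown), and the function names, the `0.85` threshold, the URLs, and the three-dimensional vectors are all hypothetical placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_suspects(article_vec, indexed_pages, threshold=0.85):
    """Return (url, score) pairs at or above the threshold, best match first."""
    scored = [
        (url, cosine_similarity(article_vec, vec))
        for url, vec in indexed_pages.items()
    ]
    hits = [(url, score) for url, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Placeholder embeddings; real ones would come from a trained model
# and have hundreds of dimensions.
article_vec = [0.8, 0.1, 0.6]
indexed_pages = {
    "https://example.com/near-copy": [0.79, 0.12, 0.61],
    "https://example.com/paraphrase": [0.7, 0.2, 0.65],
    "https://example.org/unrelated": [0.05, 0.95, 0.1],
}

suspects = find_suspects(article_vec, indexed_pages)
for url, score in suspects:
    print(f"{score:.3f}  {url}")
```

With vectors like these, both the near copy and the paraphrase clear the threshold while the unrelated page does not. In practice the threshold would need tuning against labelled examples to balance missed plagiarism against false alarms.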

Results

We developed a user-friendly, intuitive tool that lets a publisher track content plagiarism on the web and enables further legal action.

 

Thanks to built-in shortcuts, Copysearcher allows immediate use of Google’s ‘Copyright Removal’ option, as well as direct contact with the plagiarist or the publisher’s legal team. Thanks to its high efficiency and accessibility, Copysearcher helped our partner protect their content better and maximize their influence by keeping their audience’s attention.

Tech Stack

KUBERNETES
DOCKER
PYTHON
FLASK
REACT
PYTORCH

What are the customers saying?

The MVP allowed the startup to secure a second round of funding, paving the way for a fully functional platform. xBerry R&D House was agile throughout the project. The team prioritized delivery and adjusted to scope changes with little to no friction.

Mariusz Szypura, CEO, Copysearcher
