Large language models such as OpenAI's are trained on very large datasets and draw their information from all over the internet.
What if an LLM were turned loose on the dark web? A team of South Korean researchers did exactly that: they built a new artificial intelligence model called DarkBERT to index domains on the dark web.
DarkBERT offers a fascinating glimpse into some of the darkest parts of the World Wide Web. The dark web is where illegal activities take place, from sharing hacked data to selling drugs.
According to Futurism, although DarkBERT may look like a nightmare at first glance, the researchers say the model has a worthwhile goal: finding new ways to fight cybercrime.
Unsurprisingly, making sense of the parts of the web that search engines do not index, and that are usually reached only through specialized software, was not easy.
According to the paper “DarkBERT: A Language Model for the Dark Side of the Internet”, the model was first connected to the Tor network, which is used to access the dark web. It then began its work and built a database from the raw data it collected.
The research team says their new large language model made far better sense of the dark web than other models trained for similar tasks.
The researchers write in their paper: “Our evaluation results show that the DarkBERT text classification model outperforms pre-trained language models.”