Automated Recognition of Pornographic Content

Pornography is less straightforward to define than it may seem at first, since it is a high-level semantic category that does not translate easily into simple visual characteristics. Though it certainly relates to nudity, pornography is a different concept: many activities that involve a high degree of body exposure have nothing to do with it. That is why systems based on skin detection often produce false positives in contexts like beach shots or sports.

A commonly used definition is that pornography is the portrayal of explicit sexual matter with the purpose of eliciting arousal. This raises several challenges. First and foremost, what threshold of explicitness must be crossed for a work to be considered pornographic? Some authors deal with this issue by further dividing the classes, but this not only falls short of providing a clear-cut definition, it also complicates the classification task. The matter of purpose is still more problematic, because it is not an objective property of the document.

The importance of pornography detection in visual documents is attested by the large literature on the subject. The vast majority of those works are based on the detection of human skin and suffer from a high rate of false positives in situations of non-pornographic body exposure (as in sports). Some works use secondary criteria (such as the shape of the detected skin areas or the rejection of facial close-ups) to lower this rate.
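
To make the false-positive problem concrete, the following minimal sketch flags an image when the fraction of skin-colored pixels exceeds a threshold. It is not taken from any of the works mentioned above: the RGB rule is one commonly cited rule of thumb for skin under daylight illumination, and the names `is_skin` and `skin_fraction`, the toy pixel values, and the 0.5 threshold are all assumptions made for illustration.

```python
def is_skin(r, g, b):
    """Rule-of-thumb RGB skin test (an assumption, not a method from the text)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_fraction(pixels):
    """Fraction of (r, g, b) pixels classified as skin; 0.0 for an empty image."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(*p) for p in pixels) / len(pixels)


# Toy "beach shot": three skin-toned pixels and one blue (sea/sky) pixel.
beach_like = [(224, 172, 105)] * 3 + [(30, 120, 200)]

# Hypothetical decision threshold; a purely skin-based filter would flag this image.
flagged = skin_fraction(beach_like) > 0.5
```

The decision depends only on how much skin is visible, not on what the scene depicts, which is why such filters tend to misfire on sports or beach imagery.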

Few methods have explored other possibilities. Bags of visual features have been employed for many complex visual classification tasks, including pornography detection in images and videos. Those works, however, have explored only bags of static features. Very few works have explored spatiotemporal features or other motion information for the detection of pornography. Jansohn et al. 2009 (in "Detecting pornographic video content by combining image features with motion information") use bags of static visual features and motion analysis, including motion histograms, as separate sources of evidence, but do not consider bags of spatiotemporal features.

The NPDI research group (associated with Dr. Eduardo Valle, http://www.eduardovalle.com/) has been actively conducting experiments on automated recognition of pornographic scenes. In this context, we have recently proposed a framework for filtering unwanted content that combines local spatiotemporal features, a bag-of-visual-features representation, and supervised learning with support vector machines. For more information on this first work, we refer the reader to http://arxiv.org/abs/1101.2427 .
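
As a rough illustration of the bag-of-visual-features step, the sketch below assigns each local descriptor to its nearest codebook entry and builds a normalized histogram of those assignments. It is not the group's implementation: the function names, the two-dimensional toy descriptors, and the three-word codebook are hypothetical, and in practice the descriptors would come from a spatiotemporal feature extractor and the codebook from clustering training descriptors.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = math.dist(descriptor, word)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bag_of_features(descriptors, codebook):
    """L1-normalized histogram of visual-word occurrences for one video or image."""
    histogram = [0.0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, codebook)] += 1.0
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram


# Hypothetical toy data: a 3-word codebook and 4 local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]

video_histogram = bag_of_features(descriptors, codebook)
print(video_histogram)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length histogram is the kind of vector that a supervised classifier such as a support vector machine can be trained on, regardless of how many local features each video produced.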


NPDI

Núcleo de Processamento Digital de Imagens.
Departamento de Ciência da Computação.
Instituto de Ciências Exatas.
Universidade Federal de Minas Gerais.

Address

Av. Antônio Carlos 6627 - ICEx - sl. 3055
Pampulha - CEP 31270-010
Belo Horizonte - Minas Gerais - Brasil.

Phone: (31) 3409-5854