In this paper, we propose a classification technique for Web pages, based on the detection of structural similarities among semistructured documents, and devise an architecture exploiting such technique for the purpose of information extraction. The …