Bernhard Pollak, Wolfgang Gatterbauer: Creating Permanent Test Collections of Web Pages for Information Extraction Research. SOFSEM (2) 2007: 103-115