RDF triples extraction from company web pages: comparison of state-of-the-art Deep Models

Wouter Baes; François Portet; Hamid Mirisaee; Cyril Labbé

Conference paper, Year: 2020

Wouter Baes

Role: Author

Laboratoire d'Informatique de Grenoble

Skopai

François Portet

Role: Author
PersonId: 1069
IdHAL: francois-portet
ORCID: 0000-0003-2542-0661
IdRef: 098179160

Laboratoire d'Informatique de Grenoble

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Hamid Mirisaee

Role: Author

Skopai

Cyril Labbé

Role: Author
PersonId: 9675
IdHAL: cyril-labbe
ORCID: 0000-0003-4855-7038

Systèmes d’Information - inGénierie et Modélisation Adaptables

Abstract

Relation extraction (RE) is a promising way to extend the semantic web from web pages. However, it is unclear how RE can deal with the challenges that web pages pose, such as noise, data sparsity, and conflicting information. In this paper, we benchmark state-of-the-art RE approaches on the particular case of company web pages, since these pages are an important source of information for FinTech and BusinessTech. To this end, we present a method to build a corpus that mimics the characteristics of web pages. This corpus was used to evaluate several deep learning RE models and was compared with another benchmark corpus.
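
To make the task concrete, the following minimal sketch shows what a single RDF triple extracted from a sentence might look like. It is an illustration only and not the method evaluated in the paper: the sentence, the company and person names, the `foundedBy` predicate, and the `http://example.org/` namespace are all hypothetical, and the extraction step itself is hard-coded rather than produced by a model.

```python
from typing import NamedTuple

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"


class Triple(NamedTuple):
    """A (subject, predicate, object) statement, as in RDF."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple as one N-Triples line with IRIs in the EX namespace."""
    return f"<{EX}{t.subject}> <{EX}{t.predicate}> <{EX}{t.obj}> ."


# Made-up input sentence, as it might appear on a company web page.
sentence = "Acme Corp was founded by Jane Doe."

# A relation extraction model would predict this triple from the sentence;
# here it is written by hand to show the target output format.
extracted = Triple("Acme_Corp", "foundedBy", "Jane_Doe")

line = to_ntriples(extracted)
print(line)
```

An RE model's job is to fill in the `extracted` triple automatically for each sentence, which is the step the benchmarked deep learning models address.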

Keywords

relation extraction; NLP; linked data; deep learning

Domains

Artificial Intelligence [cs.AI]

Main file

DeepOntoNLP_Wouter(2).pdf (120.99 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02941935

Submitted on: Thursday, 17 September 2020, 14:24:24

Last modified on: Thursday, 4 April 2024, 21:29:28

Long-term archiving on: Thursday, 3 December 2020, 09:43:04

Dates and versions

hal-02941935, version 1 (17-09-2020)

Identifiers

HAL Id: hal-02941935, version 1

Cite

Wouter Baes, François Portet, Hamid Mirisaee, Cyril Labbé. RDF triples extraction from company web pages: comparison of state-of-the-art Deep Models. 1st International Workshop Deep Learning meets Ontologies and Natural Language Processing, Sep 2020, Bozen-Bolzano, Italy. ⟨hal-02941935⟩

Collections

UGA CNRS LIG LIG_GLSI_SIGMA LIG_TDCGE_GETALP LIG_SIDCH
