l3sintern:research_seminar_10

Differences

This shows you the differences between two versions of the page.

l3sintern:research_seminar_10 [2011/01/04 14:51]
denecke
l3sintern:research_seminar_10 [2011/01/13 12:29] (current)
siberski
Line 631: Line 631:
 **organized by: ** Dimitris ​ **organized by: ** Dimitris ​
  
-Speakers: Julien, Marco
+Speakers: Dimitris, Julien, Marco
  
-**Topic(s)**
+==== Topic(s) ====
 + 
 +=== Efficient Discovery of Frequent Subgraph Patterns in Uncertain Graph Databases (Dimitris) === 
 + 
 +Mining frequent subgraph patterns in graph databases is a challenging and important problem with applications in several domains. Recently, there has been growing interest in generalizing the problem to uncertain graphs, which can model the inherent uncertainty in the data of many applications. The main difficulty in solving this problem results from the large number of candidate subgraph patterns to be examined and the large number of subgraph isomorphism tests required to find the graphs that contain a given pattern. The latter becomes even more challenging when dealing with uncertain graphs. In this paper, we propose a method that uses an index of the uncertain graph database to reduce the number of comparisons needed to find frequent subgraph patterns. The proposed algorithm relies on the apriori property for enumerating candidate subgraph patterns efficiently. Then, the index is used to reduce the number of comparisons required for computing the expected support of each candidate pattern. It also enables additional optimizations with respect to scheduling and early termination that further increase the efficiency of the method. The evaluation of our approach on three real-world datasets as well as on synthetic uncertain graph databases demonstrates significant cost savings with respect to the state-of-the-art approach. 
 + 
 + 
 +=== Time-Aware Entity-Based Multi-Document Summarisation (Julien) === 
 +Automatic multi-document summarisation of news has received increased attention lately as a way to
 +cope with the growing number of news articles and sources. Summarisation of 
 +news articles has the additional challenge that documents (news articles) are timestamped,
 +and often relate events which are themselves situated in time.
 + 
 +We propose three contributions which we believe will help improve summarisation quality:
 +  - Considering named entities in news articles
 +  - Considering time for summarisation and for summary layout
 +  - Considering time references in the text in addition to article timestamps
 + 
 +For this, we augment a state-of-the-art summarisation technique with named entities and
 +time references, and adapt a state-of-the-art news event detection method to cluster sentences
 +in order to improve the summarisation of news articles.
 + 
 +This work is in progress, and I will present the general approach and ideas, as well as 
 +the current status of the work. 
 + 
 +=== Detecting Health Events on the Social Web to Enable Epidemic Intelligence (Marco) ===  
 + 
 +Content analysis and clustering of natural language documents has become 
 +crucial in various domains, including public health. Recent pandemics such as Swine 
 +Flu have caused concern for public health officials. 
 +Given the ever-increasing pace at which infectious diseases can spread globally, 
 +officials must be prepared to react sooner and with greater epidemic
 +intelligence gathering capabilities. There is a need to allow
 +for information gathering from a broader range of sources,
 +including the Web, which in turn requires more robust processing
 +capabilities. To address this need, in this paper
 +we propose a new approach to detect public health events
 +in an unsupervised manner. We address the problems associated
 +with adapting an unsupervised learner to the medical
 +domain and, in doing so, propose an approach which
 +combines aspects from different feature-based event detection
 +methods. We evaluate our approach on a real-world
 +dataset with respect to the quality of article clusters. Our
 +results show that we are able to achieve a precision of 62%
 +and a recall of 75%, evaluated using manually annotated,
 +real-world data.
  
  
l3sintern/research_seminar_10.1294152696.txt.gz · Last modified: 2011/01/04 14:51 by denecke