supported by the EC IST Programme
CoreGRID European Research Network on Foundations, Software Infrastructures and Applications
for large scale distributed, GRID and Peer-to-Peer Technologies
Friday, 05 September 2008

Institute on Knowledge and Data Management
Research Objectives

Institute leader: Domenico Talia (talia@deis.unical.it), University of Calabria

 
The overall objective of this Institute is to further the integration of data management and knowledge discovery with GRID technologies in order to provide knowledge-based GRIDs for the Semantic GRID and the Knowledge GRID. The Institute provides a collaborative setting for European research teams working on: distributed storage management on GRIDs; the development of knowledge techniques and tools for supporting data-intensive applications; and the integration of data and computation GRIDs with information and knowledge GRIDs. The goal is to strengthen the joint activity of research groups whose collaboration today is sporadic and partial, promoting larger leading teams and supporting efforts towards standard models and tools for data and knowledge management on GRIDs and P2P systems.

This Institute has tasks in three main areas:

  • Distributed Data Management: Providing infrastructures, techniques, and policies for managing storage resources in the GRID.
  • Information and Knowledge Management: Developing metadata, semantic representation, and protocols for GRID service discovery, information management, and the design of knowledge-oriented GRID services.
  • Data Mining and Knowledge Discovery: Design of GRID resource semantic mapping, database querying on GRIDs, and services for distributed data mining and knowledge discovery on GRIDs.

Roadmap version 3 on Knowledge and Data Management

Publications related to the Institute on Knowledge and Data Management
 
Research Groups

Research Group | Leader | Participants
Grid Data Storage Access and Management Architecture | FORTH | FORTH, PSNC, SZTAKI, UCY, INFN, UNL
Storage Security | INFN | INFN, FORTH, STFC
GRID Data Integration Models and Architectures | UoM | UNICAL, UoM
Methods for Deriving GRID Trust and Security Policies for Managing VOs | CETIC | CETIC, STFC
Distributed Data Mining in GRIDs and P2P Systems | UNICAL | UNICAL, ISTI-CNR, UCY, ICAR-CNR
Adaptivity in Distributed Query and Workflow | UNCL | UoM, UNCL
 
Latest Research Highlights

On Usage Control in Data Grids

CoreGRID Technical Report TR-0154 [pdf]: 

This paper reasons on usage control in Data Grids. First, we present a usage-based Grid authorization architecture using the functional components of current Grids, and consider the advantages of using Semantic Grid technologies for the specification of UCON subjects and objects. Then, we analyse the formal requirements for an enforcement mechanism for UCON policies, using the KAOS requirements engineering methodology with a bottom-up approach. To do so, we provide an abstract specification of an enforcement mechanism. We then prove that this specification is sound and complete, showing formally that it can enforce all the policies pertaining to Sandhu’s UCON authorization sub-models. Using the rigorous requirements engineering methodology of KAOS, we derive the operational requirements for each sub-model, showing that each one can be enforced by the specification previously provided.
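As a rough illustration of what distinguishes usage control from a one-off access check, the sketch below re-evaluates a policy while a resource is being used and revokes the session when a mutable attribute changes the outcome. It is not the enforcement mechanism specified in the report; the names `Subject`, `UsageSession`, `within_quota`, and the attributes `quota` and `bytes_read` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    """A user together with mutable attributes (hypothetical model)."""
    name: str
    attributes: dict = field(default_factory=dict)


def within_quota(subject):
    """Hypothetical policy: usage is allowed while bytes_read < quota."""
    attrs = subject.attributes
    return attrs.get("bytes_read", 0) < attrs.get("quota", 0)


class UsageSession:
    """Checks a policy before usage starts and again after every read."""

    def __init__(self, subject, policy):
        self.subject = subject
        self.policy = policy
        self.active = False

    def try_access(self):
        # Pre-authorization: decide before any usage takes place.
        self.active = self.policy(self.subject)
        return self.active

    def read(self, nbytes):
        # Refuse the read if the session was never granted or was revoked.
        if not self.active:
            return False
        # Attribute update: usage itself changes the subject's attributes.
        attrs = self.subject.attributes
        attrs["bytes_read"] = attrs.get("bytes_read", 0) + nbytes
        # Ongoing authorization: re-check and revoke if the policy now fails.
        if not self.policy(self.subject):
            self.active = False
        return True


alice = Subject("alice", {"quota": 100})
session = UsageSession(alice, within_quota)
session.try_access()   # granted: nothing has been read yet
session.read(60)       # performed, session stays active
session.read(60)       # performed, but the quota is now exceeded: revoked
session.read(1)        # refused
```

The point of the sketch is only the shape of the decision process: the same policy is consulted both before and during usage, and the outcome depends on attributes that usage itself modifies.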

A Scalable Architecture for Discovery and Planning in P2P Service Networks

  CoreGRID Technical Report TR-0152 [pdf]:

The desirable global scalability of Grid systems has steered research towards the employment of the peer-to-peer (P2P) paradigm for the development of new resource discovery systems. As Grid systems mature, the requirements for such a mechanism have grown from simply locating the desired service to composing more than one service to achieve a goal. In the Semantic Grid, resource discovery systems should also be able to automatically construct any desired service that is not already present in the system by using other, already existing services. In this report, we present a novel system for the automatic discovery and composition of services, based on the P2P paradigm, with a Grid environment in mind as the application (but not limited to it). The report improves composition and discovery by exploiting a novel network partitioning scheme for the decoupling of services that belong to different domains and an ant-inspired algorithm that places co-used services in neighbouring peers.

A data-centric security analysis of ICGrid

CoreGRID Technical Report TR-0145 [pdf]: 

The Data Grid is becoming a new paradigm for eHealth systems due to its enormous storage potential using decentralized resources managed by different organizations. The storage capabilities in these novel “Health Grids” are quite suitable for the requirements of systems like ICGrid, which captures, stores and manages data and metadata from Intensive Care Units. However, this paradigm depends on widely distributed storage sites, and therefore requires new security mechanisms able to avoid potential leaks and to cope with modification and destruction of stored data in the presence of external or internal attacks. Particular emphasis must be put on the patient’s personal data, the protection of which is required by legislation in many countries of the European Union and the world in general. Taking into consideration the underlying data protection legislation and technological data privacy mechanisms, in this paper we identify the security issues related to ICGrid’s data and metadata by applying an analysis framework extended from our previous research on the Data Grid’s storage services. Then, we present a privacy protocol that demonstrates the use of two basic approaches (encryption and fragmentation) to protect patients’ private data stored using the ICGrid system.

Providing security to the Desktop Data Grid

  CoreGRID Technical Report TR-0144 [pdf]:

Volunteer Computing is becoming a new paradigm not only for the Computational Grid, but also for institutions using production-level Data Grids, because of the enormous storage potential that may be achieved at a low cost by using commodity hardware within their own computing premises. However, this novel “Desktop Data Grid” depends on a set of widely distributed and untrusted storage nodes, and therefore offers no guarantees about either the availability or the protection of the stored data. These security challenges must be carefully managed before fully deploying Desktop Data Grids in sensitive environments (such as eHealth) to cope with a broad range of storage needs, including backup and caching.
In this paper we propose a cryptographic protocol able to fulfil the storage security requirements of a generic Desktop Data Grid scenario, which were identified by applying an analysis framework extended from our previous research on the Data Grid’s storage services. The proposed protocol uses three basic mechanisms to accomplish its goal: (a) symmetric cryptography and hashing, (b) an Information Dispersal Algorithm and (c) the novel “Quality of Security” (QoSec) quantitative metric. Although the focus of this work is the associated protocol, we also present an early evaluation using an analytical model. Our results show a strong relationship between the assurance of the data at rest, the QoSec of the Volunteer Storage Client and the number of fragments required to rebuild the original file.
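To give a feel for how fragmentation and hashing can work together on untrusted storage nodes, the sketch below splits a byte string into n fragments and checks each fragment against a SHA-256 digest before rebuilding. It is a deliberately simplified stand-in, not the protocol from the report: it uses plain XOR splitting, where all n fragments are required, rather than an Information Dispersal Algorithm, and it includes no encryption or QoSec metric. The function names `split` and `rebuild` are hypothetical.

```python
import hashlib
import secrets


def split(data, n):
    """Split data into n XOR fragments; all n are needed to rebuild.

    Returns a list of (fragment, sha256_hex) pairs. In a real deployment the
    digests would be kept by the data owner, not by the storage nodes.
    """
    if n < 2:
        raise ValueError("need at least two fragments")
    # n - 1 fragments are pure random pads of the same length as the data.
    fragments = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The last fragment is the data XORed with every random pad.
    last = data
    for pad in fragments:
        last = bytes(a ^ b for a, b in zip(last, pad))
    fragments.append(last)
    return [(f, hashlib.sha256(f).hexdigest()) for f in fragments]


def rebuild(fragments):
    """Verify every fragment's digest, then XOR all fragments together."""
    if not fragments:
        raise ValueError("no fragments supplied")
    result = bytes(len(fragments[0][0]))  # all-zero buffer
    for fragment, digest in fragments:
        if hashlib.sha256(fragment).hexdigest() != digest:
            raise ValueError("fragment failed integrity check")
        result = bytes(a ^ b for a, b in zip(result, fragment))
    return result


record = b"example record"
stored = split(record, 4)
restored = rebuild(stored)  # equals record when no fragment was altered
```

With this scheme any subset of fewer than n fragments reveals nothing about the data, but losing a single fragment makes the data unrecoverable. A threshold scheme, in which fewer than n fragments suffice to rebuild, trades some of that confidentiality for availability.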

Distributed Data Mining in Desktop Grids

CoreGRID Technical Report TR-0141 [pdf]: 

Several kinds of scientific and commercial applications require the execution of a large number of independent tasks. One highly successful and low-cost mechanism for acquiring the necessary compute power for these applications is the “public-resource computing”, or “desktop Grid”, paradigm, which exploits the computational power of private computers. So far, this paradigm has not been applied to data mining applications for two main reasons. First, it is not trivial to decompose a data mining algorithm into truly independent sub-tasks. Second, the large volume of data involved makes it difficult to handle the communication costs of a parallel paradigm. In this paper, we focus on one of the main data mining problems: the extraction of closed frequent itemsets from transactional databases. We show that it is possible to decompose this problem into independent tasks, which however need to share a large volume of data. We thus introduce a data-intensive computing network, which adopts a P2P topology based on super peers with caching capabilities, aiming to support the dissemination of large amounts of information. Finally, we evaluate the execution of our data mining job on such a network.
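For readers unfamiliar with the problem, the sketch below shows what a closed frequent itemset is, using a brute-force enumeration on a toy database. An itemset is frequent if at least `min_support` transactions contain it, and closed if no proper superset occurs in the same number of transactions. This is only an illustration of the definition, not the decomposition or the algorithm used in the report; the function names and the toy data are assumptions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force search; exponential in the number of distinct items."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = support(itemset, transactions)
            if count < min_support:
                continue  # not frequent
            # Support never grows when items are added, so it is enough to
            # check the supersets that add exactly one item.
            closed = all(
                support(itemset | {extra}, transactions) < count
                for extra in items
                if extra not in itemset
            )
            if closed:
                result[itemset] = count
    return result


toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
found = closed_frequent_itemsets(toy_db, min_support=2)
# found maps {a} -> 4, {a, b} -> 2 and {a, c} -> 2; {b} and {c} are frequent
# but not closed, because {a, b} and {a, c} have the same support.
```

The closed itemsets are a compact summary: the support of every frequent itemset can be recovered from them, which is why they are a common mining target.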

 
 
© 2008 CoreGRID Network of Excellence - European Grid Research
 