WP4
Analysis of non-EBRCN databases and
network services of interest to BRCs
Current status
Paolo Romano
EBRCN General Meeting, Paris, 28-29/11/2002
WP4: databases of interest
Short delay: ca. 1 month
· Definition of a list of databases and
services that could be of interest to BRCs: done
· Selection of a subset of those
databases and services: done
WP4: identifiers and methods
· Selection of information of interest to
BRCs within selected databases:
ongoing; done for Medline & EMBL
· Analysis of identifiers and information
and of methods for linking: ongoing;
done for Medline
WP4: PubMed IDs
· Update of CABRI catalogue production
guidelines: ongoing; done for Literature
in animal and human cells
· Retrieval of the PubMed IDs needed for
linking: ongoing; done for ICLC,
BCCM/LMBP, NCCB plasmids; support
from DSMZ (Kracht) and BCCM
(Guissart)
WP4: structure and syntax
· Catalogue structures update: ongoing;
done for Literature in animal and
human cells
· SRS structure and syntax files:
ongoing, depending on deadlines for
submission of catalogues; done for
ICLC
WP4: catalogues updates
Catalogues updates:
· ICLC: November 2002 (done)
· Plasmids and cell lines: January 2003
· “Other catalogues”: February 2003
· Bacteria: March 2003
· Fungi and Yeasts: May 2003
WP4: EMBL links
• EMBL Data Library is the European database for
DNA sequences
• It is updated daily, and coordination with NCBI
and DDBJ ensures its completeness
• It is offered at EBI by means of SRS
WP4: EMBL links
• Tests have been conducted to identify how to link
to EMBL Data Library through SRS, without IDs
• Tests performed on:
  • Bacteria and Archaea
  • Animal and Human Cell Lines
  • Fungi and Yeasts
  • Plasmids
  • Viruses
WP4: EMBL links variability
• Links are different for different materials
• Links can use various EMBL fields:
  • All-text (not very useful)
  • Organism (for micro-organisms)
  • Division (useful for viruses and plasmids)
  • Feature Table data (allow for a correct definition of a
  source through Key, Qualifier, Description)
WP4: EMBL links variability
• Example search: CBS 100.20 in CBS_FIL
• Fields and values:
  • Organism: fungi
  • Ft-Key: source
  • Ft-Qualifier: strain
  • Ft-Description: "cbs 100.20"
WP4: EMBL links variability
• Annotation problems:
• CBS 100.20 can be annotated as CBS 100.20 or
CBS100.20
• CBS 12345 can be annotated as CBS12345
• Indexing problems:
• CBS 100.20 is indexed as CBS, 100 and 20
• The dot is not included and is used as a space
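The two problems above can be illustrated with a minimal Python sketch. It is not the actual SRS indexer: the function names are hypothetical, and the tokenizing rule (lowercase the text, treat every non-alphanumeric character as a space) is an assumption based only on the behaviour described on this slide.

```python
import re


def index_tokens(label: str) -> list[str]:
    """Split a label into index tokens (assumed rule, not the real SRS code).

    Every run of non-alphanumeric characters, including the dot,
    is treated as a separator, as described on the slide.
    """
    return [tok for tok in re.split(r"[^0-9a-z]+", label.lower()) if tok]


def annotation_variants(acronym: str, number: str) -> list[str]:
    """Return the two spellings mentioned above: with and without a space."""
    return [f"{acronym} {number}", f"{acronym}{number}"]


for variant in annotation_variants("CBS", "100.20"):
    print(variant, "->", index_tokens(variant))
```

Under these assumptions the spaced spelling yields three tokens (cbs, 100, 20) and the unspaced one yields two (cbs100, 20), so a query has to cover both cases.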
WP4: EMBL links variability
Examples of searches:
• Query: Bacteria & source & cip*
( ([emblrelease-FtKey:source] &
[emblrelease-FtQualifier:strain] &
[emblrelease-FtDescription:cip*])
< [emblrelease-Organism:bacteria*] )
• Query: Cell line & source & dsm*
( ([emblrelease-FtKey:source] &
[emblrelease-FtQualifier:cell_line] &
[emblrelease-FtDescription:dsm*])
< [emblrelease-Organism:mammalia*] )
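Queries of this shape could be generated from catalogue data with a small helper. The Python sketch below only assembles the query string in the syntax shown on this slide; the function name and its parameters are hypothetical, and it does not contact any SRS server.

```python
def srs_source_query(qualifier: str, description: str, organism: str,
                     databank: str = "emblrelease") -> str:
    """Build an SRS query on the 'source' feature, linked to an organism.

    Only string assembly: field names and operators are copied from
    the example queries on the slide.
    """
    def field(name: str, value: str) -> str:
        return f"[{databank}-{name}:{value}]"

    feature = " & ".join([
        field("FtKey", "source"),
        field("FtQualifier", qualifier),
        field("FtDescription", description),
    ])
    return f"( ({feature}) < {field('Organism', organism)} )"


# The two example searches from the slide
print(srs_source_query("strain", "cip*", "bacteria*"))
print(srs_source_query("cell_line", "dsm*", "mammalia*"))
```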
WP4: EMBL links variability
Example search:
• Query: Fungi & source & cbs 100.20
( ( ([emblrelease-FtKey:source] &
[emblrelease-FtQualifier:strain] &
( ( [emblrelease-FtDescription:cbs] &
[emblrelease-FtDescription:100] ) |
[emblrelease-FtDescription:cbs100] ) &
[emblrelease-FtDescription:20]) )
< [emblrelease-Organism:fungi*] )
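A query that tolerates both annotation styles could be built automatically for any strain number. The following Python sketch is one possible way to do it, assuming the tokenizing behaviour described earlier (non-alphanumeric characters act as spaces); the function name and parameters are hypothetical and only the query string is produced.

```python
import re


def srs_strain_query(acronym: str, number: str, organism: str,
                     databank: str = "emblrelease") -> str:
    """Build a strain query matching 'CBS 100.20' as well as 'CBS100.20'."""
    def desc(token: str) -> str:
        return f"[{databank}-FtDescription:{token}]"

    acronym = acronym.lower()
    # Split the number at non-alphanumeric characters (assumed indexing rule)
    parts = [p for p in re.split(r"[^0-9a-z]+", number.lower()) if p]
    if not parts:
        raise ValueError("strain number contains no searchable token")
    head, rest = parts[0], parts[1:]

    # Either acronym and first token indexed separately, or fused into one
    terms = [f"( ( {desc(acronym)} & {desc(head)} ) | {desc(acronym + head)} )"]
    terms += [desc(p) for p in rest]

    feature = " & ".join(
        [f"[{databank}-FtKey:source]", f"[{databank}-FtQualifier:strain]"] + terms
    )
    return f"( ({feature}) < [{databank}-Organism:{organism}] )"


print(srs_strain_query("CBS", "100.20", "fungi*"))
```

For CBS 100.20 this produces the same logical expression as the query above, apart from one redundant pair of parentheses.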
WP4: extracted databases
Extracted databases
• Selection of a meaningful subset of information
(strain identification) for each material, including
links to external dbs/services: ongoing;
proposal to be sent to collections next month