SEARCHING BIOTECHNOLOGY INFORMATION IN THE 2010s:
Section II (Databases & Search Strategies)
Luca Falciola, IP Manager, Promethera Biosciences
Sardegna Ricerche-Univ. Cagliari (Sep. 15th 2014)
Databases & Biotechnology:
A foreword
 Covering even a limited number of databases is hardly feasible, so a selection is required according to a few criteria:
 Free access (at most requiring registration at a website with a username/password and an e-mail address to get full access to services; for this purpose it is better to use a separate free e-mail account, dedicated only to this and to receiving Tables of Contents, updates, etc.)
 Overall positive reputation, importance, and a good « search experience » even for occasional users
 This selection can easily be expanded for specific objectives by:
 Searching the structured repertory on the Nucl. Acids Res. website and its yearly update
 Combining search topics in Google and/or Pubmed
 Exploring the NCBI and EBI websites
Sardegna Ricerche
L. Falciola
15/09/2014
DATABASES FOR BIOTECHNOLOGY INFORMATION
1 Scientific Literature
- Pubmed
- HighWire
- Publishers’ website
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases
Scientific Literature:
Introduction
 Databases of scientific literature are numerous and mostly thematic; Pubmed plays a major role in the life sciences
 More than 1.2 million entries for 2013, for a total of more than 25 million
 Considered by many to be the most complete database
 This leading position should not overshadow other resources that, at different levels, may be competitive for identifying relevant literature
 Commercial ones (EMBASE, SCISEARCH, SciVerse, BIOSIS, etc.)
 Databases covering a large panel of publishers to promote the purchase of articles, which provide full-text search features or other advanced search / push services
 The full-text vs. indexing/completeness comparison is actually a major topic
Pubmed:
Introduction
 Pubmed offers almost everything you need, with the exception of full-text search
 A well-organized help page including links to YouTube and other tutorials
 Access to the large panel of NCBI services, as summarized in this guide and this NAR paper
 Sign-in page for accessing an even larger panel of features
 Guides to other literature databases, the NCBI digital library, and the MeSH system
Pubmed:
Advanced Search and Search History
 Both features are on the same page, which can be maintained or even saved
Pubmed:
Field Search
 A large number of fields is available for text or numeric searches
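As a minimal sketch of how fielded terms can be written, the Python helper below appends a field tag in square brackets to each search term. The tags used here ([ti], [au], [dp], [mh]) are standard Pubmed field tags; the function name `fielded`, the `FIELD_TAGS` table, and the example values are hypothetical and only serve as illustration.

```python
# Hypothetical helper (not part of any Pubmed tool): it only builds the
# text of a fielded query that can be pasted into the Pubmed search box.
FIELD_TAGS = {
    "title": "ti",    # words in the article title
    "author": "au",   # author name
    "date": "dp",     # date of publication
    "mesh": "mh",     # MeSH terms
}


def fielded(term, field):
    """Return the term followed by its field tag, e.g. antibodies[mh]."""
    tag = FIELD_TAGS[field]
    # Multi-word terms are wrapped in double quotes to keep them together.
    if " " in term:
        term = f'"{term}"'
    return f"{term}[{tag}]"


# Example: a text field combined with a numeric (date) field.
query = " AND ".join([
    fielded("monoclonal antibody", "title"),
    fielded("2013", "date"),
])
print(query)
```

The printed string can be pasted as-is into the search box; the same pattern extends to any other field listed on the Advanced Search page.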
Pubmed:
MeSH Examples (antibodies)
Pubmed:
MeSH tutorial
 Some Pubmed tutorials are too complex, and some universities provide simplified versions, such as for MeSH, insisting on:
 pursuing a sequential, structured approach to identify the most relevant MeSH terms
 not forgetting that MeSH terms are not always present, and that they are useful for extracting a more relevant subset of references to explore with a series of related criteria
Pubmed:
Search Operators
 A large number of operators/symbols expand the possibilities well beyond AND, OR, NOT (truncation and double quotes are essential for pursuing precise but not too extensive searches)
 The search can also be improved with the large selection of filters in the left sidebar
Pubmed:
Heterogeneity
 An important issue is that Pubmed is intended to provide publications to users as soon as possible, which explains some heterogeneity in indexing and access to articles
Pubmed:
Some tricks
 Replace the double quotes (“”) with a hyphen (-) between two words in a phrase
 The use of truncation shows how many spelling errors are present in the database, which may make you miss some relevant hits
 Send Pubmed references by e-mail just by indicating the PMID(s) after http://www.ncbi.nlm.nih.gov/pubmed/
e.g. http://www.ncbi.nlm.nih.gov/pubmed/25031662,25000062
 The « Related » references can be saved in the search history and combined with keywords to search within them
 Search History is limited in time and length (better not to exceed 50-80 entries)
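The PMID trick above is easy to automate. This is a minimal Python sketch that joins one or more PMIDs with commas after the base address given on the slide; the function name `pubmed_link` is hypothetical.

```python
# Base address as given on the slide.
BASE_URL = "http://www.ncbi.nlm.nih.gov/pubmed/"


def pubmed_link(pmids):
    """Build a single link showing all the given PMIDs (comma-separated)."""
    cleaned = []
    for pmid in pmids:
        text = str(pmid).strip()
        # A PMID is a plain number; reject anything else early.
        if not text.isdigit():
            raise ValueError(f"not a valid PMID: {pmid!r}")
        cleaned.append(text)
    return BASE_URL + ",".join(cleaned)


link = pubmed_link([25031662, 25000062])
print(link)
```

The resulting link can be pasted into an e-mail so that the recipient opens exactly the same set of records.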
Highwire Press:
Overview
 Large life science literature database hosted by Stanford Univ., aggregating journals from many major publishers as well as books and conference abstracts, also as full text and with some useful filters
Highwire:
Help Page
Highwire:
Search Results & History
Highwire:
Services
 Preview of keywords in context, alerts for new articles including a given citation or keywords, alternative viewing features, links to supplementary/free documents, and management of ToC are all well implemented
Publishers’ Website:
Introduction
 All main publishers with a large panel of journals have nice features to keep track of new articles or to search their publications
 Nature, Science, Wiley, Springer
 ScienceDirect (Elsevier) is particularly rich in functions and has broad coverage (even of journals not indexed in Pubmed)
Publishers’ Website:
Other Examples
 Wiley
DATABASES FOR BIOTECHNOLOGY INFORMATION
1 Scientific Literature
2 Patent Literature
- Lens
- Espacenet
- Patentscope
3 Chemical Structure & Biological Sequences
4 Metabases
Patent Literature:
Introduction
 Patent information that may be relevant for a biotech search is available in a
variety of formats:
 Text-based
 Biological sequences
 Chemical structures
 Regular review of patent publications can be performed by appropriately using three types of tools:
 Multi-patent office websites (Patentscope, Espacenet, Lens)
 Patent office-specific tools (at USPTO, EPO, Australian, Indian, etc.), but in general poorly implemented beyond basic number or proceedings searches
 Access for sequence- or structure-based searches (Lens, EBI)
 Each approach and tool has its own strengths/weaknesses:
 Need to compare/double-check
 Access to PDF and identification of keyword context
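One simple way to compare and double-check the tools is to run the same search in each of them and look at the publication numbers that only one tool returns. The Python sketch below illustrates the idea under strong assumptions: the result lists are typed in by hand, the publication numbers are invented for the example, and the `normalize` step only strips spaces and punctuation (real number formats can also differ in zero padding or kind codes, which this sketch does not handle).

```python
import re


def normalize(number):
    """Keep letters and digits only, in upper case."""
    return re.sub(r"[^A-Za-z0-9]", "", number).upper()


def only_in_one_tool(results):
    """For each tool, return the numbers that no other tool has found."""
    normalized = {
        tool: {normalize(n) for n in numbers}
        for tool, numbers in results.items()
    }
    report = {}
    for tool, numbers in normalized.items():
        others = set()
        for other_tool, other_numbers in normalized.items():
            if other_tool != tool:
                others |= other_numbers
        report[tool] = numbers - others
    return report


# Invented publication numbers, for illustration only.
results = {
    "Lens": ["WO 2014/000001 A1", "EP1000001B1"],
    "Espacenet": ["wo2014000001a1", "US 9000001 B2"],
}
report = only_in_one_tool(results)
for tool in sorted(report):
    print(tool, sorted(report[tool]))
```

Documents reported for a single tool are the ones worth checking by hand, since they may reveal a coverage gap or simply a difference in number formatting.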
Patent Literature:
Overview
 Main strengths:
 Patentscope and Lens: full-text/stemmed/nested searches, large number of criteria, login for saving search strategies, graphical/automated grouping of results
 Patentscope and Espacenet: machine translation
 Lens: somewhat easier to use for both searching and getting/sending links to PDF files, nice support section, possible to search only granted patents, nice sorting/filtering functions, claims and abstract on the same page
 Espacenet: Cooperative Patent Classification & citing/cited documents features for (non-)EP appl., link to EPO register, links to (often) reliable patent family & Inpadoc/status information
 Main weaknesses:
 Patentscope: instability during long search sessions, IPC only, no clear patent family information
 Lens: format inconsistency for code/number fields, coverage and patent family definition, with functions appearing and disappearing (now providing IPC and USPC)
 Espacenet: somewhat old-style for both searching documents and getting PDF files
 In general:
 No visibility on actual coverage for all collections
 Limited means to identify keyword context
Lens:
Search Window
Lens:
Filtering Features
Lens:
Help Page
Espacenet:
Search Window and Criteria
Espacenet:
Patent Kind Codes & Help
Espacenet:
CPC Classification
Espacenet:
Results & Record View
Patentscope:
Search Window & Results
Patentscope:
Record & Records Analysis
DATABASES FOR BIOTECHNOLOGY INFORMATION
1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
- Uniprot
- EBI-Fasta
- ChEMBL/Pubchem
4 Metabases
Uniprot:
Overview & Search Criteria
Uniprot:
HBB in Genecards Vs Uniprot
EBI-Fasta:
Search Window
EBI-Fasta:
Overview of Results
EBI-Fasta:
Patent Sequence Record
ChEMBL:
Introduction
 Medicinal chemistry data/products are now more accessible, also to non-specialists, through portals such as EBI/ChEMBL, PubChem, or DrugBank, which aggregate the data and make them searchable through different criteria across biological/medical/patent information (together with chemical information from proprietary repositories) for creating Molecular Clouds (Ertl and Rohde, J Cheminf 2012)
ChEMBL:
Features
ChEMBL:
Search & Browse Features
ChEMBL:
Targets, Ligands & Drug Approvals
DATABASES FOR BIOTECHNOLOGY INFORMATION
1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases
- Google
- Google Scholar
- Drugbank
Google:
Advanced Search & GoogleGuide
Google Scholar:
Introduction
 This site claims to have broad coverage of both scientific and patent literature, but the actual coverage is unclear:
 beyond US patent documents, and by which date (they index papers and not journals)
 of which publishers
 The system has some additional useful features compared to “pure” Google:
 Separate advanced search features
 Management of alerts through your own Gmail account
 Import features for reference management systems (but not always precise)
 Selection by publication date instead of appearance on the web (but again not always precise)
 Clear link to PDF on the left side of the window
 Citation list (which can be searched separately) and “related articles” features
 Metrics / search by journal
 Focused help page with advice on how to get your paper indexed
Google Scholar:
Advanced Search Features
Google Scholar:
Settings and Metrics Features
Google Scholar:
Final Comments
 Google Scholar provides means for overcoming only some limitations of
“pure” Google
 Lack of visibility about publication/journal coverage
 Unstructured search features within documents
 Lack of indexing
 It is an interesting tool for exploratory searches or completing searches made
in “traditional” databases
 Exploiting full-text and advanced search features in a more structured environment
 Linking articles to combinations of specific technical details, cross-references, authors
 Obtaining additional search criteria to be used elsewhere
DrugBank:
Introduction
DrugBank:
Results
DrugBank:
Records
Thank you !!
[email protected]
The views and the opinions expressed in this presentation are the
author’s personal thoughts on these subjects. They are not intended
to be considered opinions and positions of Promethera, nor imply any
commitment by Promethera to any particular action.