Polo Bibliotecario di Scienze, Farmacologia e Scienze Farmaceutiche
Scuola di Dottorato di Ricerca in Scienze Molecolari
Open research data
Information literacy in Chemistry
October 10, 2014
Sistema Bibliotecario di Ateneo | Università di Padova
Emanuela Casson
http://www.parse-insight.eu/downloads/PARSE-Insight_D3-4_SurveyReport_final_hq.pdf
The current practice cannot be explained by a lack of interest among researchers in other people's data. 63% of the researchers who do not currently make use of other researchers' data would like to do so in the (near) future.
But
When asked whether they had ever truly needed digital research data from other researchers that was, for whatever reason, not available, 53% of the respondents answered yes.
http://www.parse-insight.eu/downloads/PARSE-Insight_D3-4_SurveyReport_final_hq.pdf
Open data: definition(s)
• Knowledge is open if anyone is free to access, use, modify, and share it - subject, at most, to measures that preserve provenance and openness
  http://opendefinition.org/od/
• A piece of data is open if anyone is free to use, reuse, and redistribute it - subject only, at most, to the requirement to attribute and/or share-alike
  http://en.wikipedia.org/wiki/Open_data
• By open data in science we mean that it is freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers
  pantonprinciples.org/
Open research data - what
• Data may be raw or primary data (e.g. direct from measurement), or derivative of primary data, but are not yet the product of analysis or interpretation other than calculation
  (https://royalsociety.org/~/media/policy/projects/sape/2012-06-20-saoe.pdf)
• Factual data
  • Data "observational, computational, or experimental"
  • Graphs and tables
  but there is a grey area between creative text and factual data
Open data - how
• availability and access
• reuse and redistribution
• interoperability
• universal participation
• metadata
Open data - why
Reasons for open data:
• Moral: closed can be unjust
• Ethical: community norms expect it
• Utilitarian: greater communal good
• Personal: greater personal benefit
• Most scientific data is lost; costs many billions
… mandates?
Panton Principles
• Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John (19 Feb 2010).
  http://pantonprinciples.org
• The Panton Principles are a set of recommendations that address how best to make published data from scientific studies available for re-use.
OD - Licenses
• Use a recognized waiver or license that is appropriate for data, e.g. CC0 or PDDL. Non-commercial and other restrictive clauses should not be used
• CC0 – Creative Commons Zero
  http://creativecommons.org/publicdomain/zero/1.0/
• PDDL - Public Domain Dedication and License
  http://opendatacommons.org/licenses/pddl/1.0/
OD for data users
• How do I know that data is Open Data?
• Are there any restrictions on what I can do with Open Data?
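One way a data user might approach both questions is to look at the license identifier in a dataset's metadata. The following is only a minimal sketch, not an established tool. It assumes the repository exposes a license as an SPDX-style identifier such as `CC0-1.0` or `PDDL-1.0`. The function name `classify_license`, the three result labels, and the sample identifiers are hypothetical and chosen for illustration.

```python
# Hypothetical helper for illustration; names and labels are assumptions.

# The two waivers recommended for data in these slides, written as
# SPDX-style identifiers (assumed to be how the metadata records them).
OPEN_WAIVERS = {"CC0-1.0", "PDDL-1.0"}

# Clause codes that signal restrictive terms:
# NC = non-commercial, ND = no derivatives.
RESTRICTIVE_MARKERS = ("NC", "ND")


def classify_license(license_id):
    """Classify a license identifier as 'open', 'restricted', or 'unknown'.

    'unknown' means the identifier alone does not settle the question
    and the full license terms should be read.
    """
    if not license_id:
        return "unknown"
    normalized = license_id.strip().upper().replace("_", "-")
    if normalized in OPEN_WAIVERS:
        return "open"
    parts = normalized.split("-")
    if any(marker in parts for marker in RESTRICTIVE_MARKERS):
        return "restricted"
    return "unknown"


if __name__ == "__main__":
    # Sample identifiers, invented for this example.
    for sample in ["CC0-1.0", "PDDL-1.0", "CC-BY-NC-4.0", "CC-BY-4.0", ""]:
        print(repr(sample), "->", classify_license(sample))
```

The sketch deliberately returns "unknown" for anything it does not recognize, so a missing or unfamiliar license is never treated as permission to reuse the data.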
Figshare and Dryad
• http://figshare.com
  Figshare is an online digital repository where researchers can preserve and share their research outputs, including figures, datasets, images, and videos
• http://datadryad.org
  Dryad hosts research data underlying scientific and medical publications. Historically, the repository has been strongest in the life sciences
Life sciences: GenBank and H5N1
• GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2013 Jan;41(D1):D36-42).
  http://www.ncbi.nlm.nih.gov/genbank/
• Ilaria Capua and H5N1
Molecular & supramolecular structure
• Protein Circular Dichroism Data Bank (PCDDB)
  http://www.biosharing.org
• Crystallography Open Database (COD)
  http://www.crystallography.net
• Coherent X-ray Imaging Data Bank (CXIDB)
  http://www.cxidb.org
• Biological Magnetic Resonance Data Bank (BMRB)
  http://www.bmrb.wisc.edu
• Electron Microscopy Data Bank (EMDB)
  http://www.emdatabank.org
• Worldwide Protein Data Bank (wwPDB)
  http://wwpdb.org
Crystallography open database
http://www.crystallography.net
• Open-access collection of crystal structures of organic, inorganic, metal-organic compounds and minerals, excluding biopolymers
Chemistry and chemical biology
• caNanoLab
  https://cananolab.nci.nih.gov/caNanoLab/
  This is a web-based application designed to facilitate data sharing in the research community to expedite and validate the use of nanomaterials in biomedicine
• PubChem
  http://pubchem.ncbi.nlm.nih.gov
  PubChem is a freely accessible database (FTP) that provides information about small molecules.
• ChemSpider
  http://www.chemspider.com/
  ChemSpider is a free chemical structure database providing fast access to over 32 million structures, properties, and associated information
EBI - European Bioinformatics Institute
• http://www.ebi.ac.uk/services
• ChEBI: the database and ontology of Chemical Entities of Biological Interest
  http://www.ebi.ac.uk/chebi/
• ChEMBL - Bioactive data for drug discovery
  https://www.ebi.ac.uk/chembl/
Others
http://www.nature.com/sdata/data-policies/repositories