Data Publication: The scientific knowledge base of low carbon energy technologies

Hötte K, Pichler A, Lafond F (2020)
Bielefeld University.

Datenpublikation
 
Download
OA 2.22 GB
Creator
Hötte, KerstinUniBi ; Pichler, Anton; Lafond, François
Abstract / Bemerkung
#### Note: #### An updated version of these data including data on biofuels and fuels from waste is available [here](https://pub.uni-bielefeld.de/record/2950291). The extended version also offers a package of R-scripts that have been used to reproduce the statistical analysis presented in [Hötte, Pichler, Lafond (2021): The rise of science in low-carbon energy technologies](https://doi.org/10.1016/j.rser.2020.110654).

This data publication offers data about low-carbon energy technology (LCET) patents and citations links to the scientific literature. This data publication contains different data sets (in .RData and (long-term archivable) .tsv format). Further information about each data set is provided in more detail below. - "all_papers.RData" : Data on scientific papers from Microsoft Academic Graph (MAG), 3 columns: Paper ID, Paper year, cited (binary 0-1, indicates whether the paper is cited by a patent). - "all_patents.RData" : Data on USPTO utility patents, 6 columns: Patent number, Patent year (grant year), CPC class, Patent date, Patent title, citing_to_science (binary 0-1, indicates whether the patent is citing to science). - "LCET_patents.RData" : Subset of LCET patents, 6 columns: Patent number, Patent year (grant year), Technology type, CPC class, Patent date, Patent title. - "LCET_patent_citations.RData" : Citations from LCET patents to other patents, 2 columns: citing, cited (Patent numbers). - "LCET_subset_with_metainfo_final.RData" : Citations from LCET patents to scientific papers from MAG, complemented by meta-information on patents and papers, 18 columns: Patent number, Paper ID, Patent year, Paper year, Technology type, WoS field, Patent title, Paper title, DOI, Confidence Score, Citation type, Reference type, Journal/ Conf. name, Journal ID, Conference ID, CPC class, Patent date, US patent. ### License and terms of use ### This data is licensed under the CC BY 4.0 license. See: [https://creativecommons.org/licenses/by/4.0/legalcode](https://creativecommons.org/licenses/by/4.0/legalcode) Please find the full license text below. If you want to use the data, do not forget to give appropriate credit by citing this data publication and the following paper.
Kerstin Hötte, Anton Pichler, François Lafond: *The rise of science in low-carbon energy technologies*, Renewable and Sustainable Energy Reviews, Volume 139, 2021 [https://doi.org/10.1016/j.rser.2020.110654](https://doi.org/10.1016/j.rser.2020.110654) ### LCET definition and concepts ### LCET are defined by Cooperative Patent Classification (CPC) codes. CPC offers "tags" that are assigned to patents that are useful for the adaptation and mitigation of climate change. LCET are identified by YO2E codes, i.e. that are assigned to technologies that contribute to the "REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION". Only the subset of Y02E01 ("Energy generation through renewable energy sources") and Y02E03 ("Energy generation of nuclear origin") technologies are used. 8 different LCET are distinguished: Solar PV, Wind, Solar thermal, Ocean power, Hydroelectric, Geothermal, Nuclear fission and Nuclear fusion. More information about the Y02-tags can be found in: Veefkind, Victor, et al. "A new EPO classification scheme for climate change mitigation technologies." World Patent Information 34.2 (2012): 106-111. DOI: [https://doi.org/10.1016/j.wpi.2011.12.004](https://doi.org/10.1016/j.wpi.2011.12.004) ### Data sources and compilation ### The data was generated by the merge of different data sets. 1.) Patent data from USPTO was downloaded here: https://bulkdata.uspto.gov/ 2.) Complementary data on grant year and patent title was taken from: https://cloud.google.com/blog/products/gcp/google-patents-public-datasets-connecting-public-paid-and-private-patent-data 3.) Citations to science come from the Reliance on Science (RoS) data set https://zenodo.org/record/3685972 (v23, Feb. 24, 2020) DOI: [10.5281/zenodo.3685972](10.5281/zenodo.3685972) The directory ("code") offers the R-scripts that were used to process MAG data and to link it to patent data. The header of the R-scripts offer additional technical information about the subsetting procedures and data retrieval. For more information about the patent data, see: Pichler, A., Lafond, F. & J, F. D. (2020), Technological interdependencies predict innovation dynamics, Working paper pp. 1–33. URL: [https://arxiv.org/abs/2003.00580](https://arxiv.org/abs/2003.00580) For more information about MAG data, see: Marx, Matt, and Aaron Fuegi. "Reliance on science: Worldwide front‐page patent citations to scientific articles." Strategic Management Journal 41.9 (2020): 1572-1594. DOI: [https://doi.org/10.1002/smj.3145](https://doi.org/10.1002/smj.3145) Marx, Matt and Fuegi, Aaron, Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles. Boston University Questrom School of Business Research Paper No. 3331686. DOI: [http://dx.doi.org/10.2139/ssrn.3331686 ](http://dx.doi.org/10.2139/ssrn.3331686 ) ### Detailed information about the data ### - "all_papers.RData" : Data on scientific papers from Microsoft Academic Graph (MAG), 3 columns: Paper ID: Unique paper-identifier used by MAG Paper year: Year of publication cited: binary 0-1, indicates whether the paper is cited by a patent, citation links are made in the text body and front-page of the patent, and added by examiners and applicants. - "all_patents.RData" : Data on USPTO utility patents, 6 columns: Patent number: Number given by USPTO. Can be used for manual patent search in http://patft.uspto.gov/netahtml/PTO/srchnum.htm (numeric) Patent year: Year when the patent was granted (numeric) CPC class: Detailed 8-digit CPC code (numeric) Patent date: Exact date of patent granting (numeric) Patent title: Short title (character) citing_to_science: binary 0-1, indicates whether the patent is citing to science as identified by citation links in RoS. (numeric) - "LCET_patents.RData" : Subset of LCET patents, 6 columns: Patent number: (numeric) Patent year: (numeric) Technology type: Short code used to tag 8 different types of LCET (pv, (nuclear) fission, (solar) thermal, (nuclear) fusion, wind, geo(termal), sea (ocean power), hydro) (character) CPC class: Detailed 8-digit CPC code (character) Patent date: (numeric) Patent title: (numeric) - "LCET_patent_citations.RData" : Citations from LCET patents to other patents, 2 columns: citing: Number of citing patent (numeric) cited: Number of cited patent (numeric) - "LCET_subset_with_metainfo_final.RData" : Citations from LCET patents to scientific papers from MAG, complemented by meta-information on patents and papers, 18 columns: Patent number: see above (numeric) Paper ID: see above (numeric) Patent year: see above (numeric) Paper year: see above (numeric) Technology type: see above (character) WoS field: Web of Science field of research, WoS fiels were probabilistically assigned to papers and are used as given by RoS (character) Patent title: see above (character) Paper title: Title of scientific article (character) DOI: Paper DOI if available (character) Confidence Score: Reliability score of citation link (numeric). Links were probabilistically assiged. See Marx and Fuegi 2019 for further detail. Citation type: Indicates whether citation made in text body of patent document or its front page (character) Reference type: Examiner or applicant added citation link (or unknown). (character) Journal/ Conf. name: Name of journal or conference proceeding where the cited paper was published (character) Journal ID: Journal identifier in MAG (numeric) Conference ID: Conference identifier in MAG (numeric) CPC class: see above (character) Patent date: see above (numeric) US patent: binary US-patent indicator as provided by RoS (numeric) #### Note: #### The citation links were probabilistically retrieved. During the analysis, we identified manually some false-positives are removed them from the "LCET_subset_with_metainfo_final.RData" data set. The list is available, too: "list_of_false_positives.tsv" We do not claim to have a perfect coverage but expect a precision of >98% as described by Marx and Fuegi 2019. ### Statistics about the data ### Full data set: - Number of papers in MAG: 179,083,029 - Number of all patents: 10,160,667 - Number of citing patents: 2,058,233 - Number of cited papers: 4,404,088 - Number of citation links from patents to papers: 34,959,193 LCET subset: - Number of LCET patents: 57,530 - Number of citing LCET patents: 16,674 - Number of cited papers: 53,509 - Number of citation links from LCET patents to papers: 151,253 - Number of citation links from LCET patents to other patents: 567,274 Meta-information:
Papers: - Publication year, 251 Web-of-Science (WoS) categories, Journal/ conference proceedings name, DOI, Paper title Patents: - Grant year, >250,000 hierarchical CPC classes, 8 LCET types Citation links: - Reference type, citation type, reliability score #### If you have further questions about the data or suggestions, please contact: kerstin.hotte@oxfordmartin.ox.ac.uk ### License issues ### Terms of use of the source data: - Reliance on Science data [https://zenodo.org/record/3685972](https://zenodo.org/record/3685972), Open Data Commons Attribution License (ODC-By) v1.0, https://opendatacommons.org/licenses/by/1.0/ - "Google Patents Public Data” by IFI CLAIMS Patent Services and Google (https://cloud.google.com/blog/products/gcp/google-patents-public-datasets-connecting-public-paid-and-private-patent-data), Creative Commons Attribution 4.0 International License (CC BY 4.0), https://console.cloud.google.com/marketplace/details/google_patents_public_datasets/google-patents-public-data - USPTO patent data (https://bulkdata.uspto.gov/), see: https://bulkdata.uspto.gov/data/2020TermsConditions.docx
Stichworte
Low-carbon energy technology; Energy transition; Patent data; Scientific articles; Microsoft Academic Graph; Citation network; Bibliometrics; Technological change; Technological knowledge; Scientific knowledge
Erscheinungsjahr
2020
Page URI
https://pub.uni-bielefeld.de/record/2941555

Zitieren

Hötte K, Pichler A, Lafond F. Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University; 2020.
Hötte, K., Pichler, A., & Lafond, F. (2020). Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University. doi:10.4119/unibi/2941555
Hötte, Kerstin, Pichler, Anton, and Lafond, François. 2020. Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University.
Hötte, K., Pichler, A., and Lafond, F. (2020). Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University.
Hötte, K., Pichler, A., & Lafond, F., 2020. Data Publication: The scientific knowledge base of low carbon energy technologies, Bielefeld University.
K. Hötte, A. Pichler, and F. Lafond, Data Publication: The scientific knowledge base of low carbon energy technologies, Bielefeld University, 2020.
Hötte, K., Pichler, A., Lafond, F.: Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University (2020).
Hötte, Kerstin, Pichler, Anton, and Lafond, François. Data Publication: The scientific knowledge base of low carbon energy technologies. Bielefeld University, 2020.
Alle Dateien verfügbar unter der/den folgenden Lizenz(en):
Creative Commons Namensnennung 4.0 International Public License (CC-BY 4.0):
Volltext(e)
Access Level
OA Open Access
Zuletzt Hochgeladen
2020-04-21T11:05:19Z
MD5 Prüfsumme
77c444e1cc964103655ef9d7fc970d5c


Material in PUB:
Spätere Version
In sonstiger Relation
The rise of science in low-carbon energy technologies
Hötte K, Pichler A, Lafond F (2021)
Renewable and Sustainable Energy Reviews 139: 110654.
Export

Markieren/ Markierung löschen
Markierte Publikationen

Open Data PUB

Suchen in

Google Scholar