Computational Research Integrity Conference
CRI-CONF 2021

March 23-25, 2021

Center for Strategic and International Studies, Washington, DC

Online

About the Computational Research Integrity Conference (CRI-CONF)

This conference will bring together researchers from the biomedical sciences and ethics, as well as computer scientists, AI researchers, and statisticians, to discuss how research integrity investigations can be made faster, more accurate, and more systematic through the use of computational methods.

Topics to be discussed at the conference include:

  • Computational or best-practices methods for detecting fabrication, falsification, or plagiarism of text, images, statistics, or other research outcomes
  • The role of research integrity offices at the institutional (e.g., ARIO member) and funders' levels (e.g., ORI)
  • The role of publishers and whistleblowers' websites (e.g., PubPeer)
  • Ethical dimensions of automating research integrity
  • Case studies
  • Any other research broadly related to non-computational or computational research integrity

Where

Online

When

Tuesday to Thursday
March 23-25, 2021

Invited Speakers

(in alphabetical order)

Daniel Acuna (organizer)
Syracuse University

Boris Barbour
The PubPeer Foundation

Thorsten Beck
HEADT Centre - Humboldt University of Berlin

Elisabeth Bik
Harbers-Bik LLC

Jennifer Byrne
The University of Sydney

Edward J. Delp
Purdue University

Michèle Nuijten
Tilburg University

Ivan Oransky
Retraction Watch

Lauran Qualkenbush
Northwestern University

Corinna Raimondo
Northwestern University

Walter Scheirer
University of Notre Dame

Debora Weber-Wulff
HTW Berlin - University of Applied Sciences

Panelists

(in alphabetical order)

IJsbrand Jan Aalbersberg (Scopus), Erica Boxheimer (EMBO), Paul Brookes (University of Rochester), Jana Christopher (Image-Integrity), Renee Hoch (PLOS ONE), Wanda Jones (ORI), Maria Kowalczuk (Springer Nature), Stephanie Lee (BuzzFeed), Benyamin Margolis (ORI), Bernd Pulverer (EMBO), Amit K. Roy-Chowdhury (UC Riverside), William C. Trenkle (USDA), Richard Van Noorden (Nature), Wouter Vandevelde (KU Leuven), Mary Walsh (Harvard University)

Event Schedule

The current event schedule is subject to change.

All times are in New York (US Eastern) time.

Welcome & remarks


Speaker Ranjini Ambalavanar, ORI

Examining Questioned Data — Detection Tools and Need for Automation
Research misconduct, as defined in the federal regulation (42 C.F.R. Part 93), means fabrication, falsification, or plagiarism (FFP) in proposing, performing, or reviewing research, or in reporting research results (§ 93.103). This session will discuss the types of questioned data and the tools currently used to identify and confirm FFP, with special emphasis on the need for additional tools and automation. Examples of different types of falsified/fabricated (FF) data from closed misconduct cases at the Office of Research Integrity (ORI), along with the forensic tools ORI uses to detect and confirm intentional FF, will be presented.

Speaker Jennifer Byrne, The University of Sydney

Computational research integrity and cancer research: building tools and narratives to improve the health of the research literature
Computational research integrity and cancer research are young and old research fields, respectively, and yet they have much in common. Cancer research involves the discovery of biological features that reliably distinguish cancer cells from normal cells. These features are targeted by drug developers to create cancer therapies that are tested by researchers and then applied by clinicians to patients. Similarly, computational research integrity involves the identification of publication features that reliably depart from established norms or standards. These features inform the creation of automated tools that are then tested and applied by researchers and publishers to manuscripts and papers. Based upon our experience of applying the semi-automated tool Seek & Blastn to the molecular cancer research literature, we will describe how the skill to employ automated literature screening tools needs to be matched by the will to apply these tools and then act upon their results. Beyond developing the skills to apply automated literature screening tools within different user groups, we propose that achieving the necessary willingness to tackle pervasive research integrity problems will require the development of positive narratives that speak to shared aspirations and values.
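
Seek & Blastn works by extracting nucleotide sequences reported in paper text and verifying their claimed identities against BLAST. As a highly simplified sketch of just the extraction step (a hypothetical Python illustration, not the tool's actual code; the real tool goes on to query blastn for each sequence), candidate sequences can be pulled from text with a regular expression:

```python
import re

# Candidate nucleotide sequences: runs of at least 15 A/C/G/T characters,
# roughly the length of the primers and siRNAs reported in molecular papers.
# (A simplified stand-in for Seek & Blastn's extraction step; the real tool
# then submits each hit to blastn to verify its claimed identity.)
SEQUENCE = re.compile(r"\b[ACGT]{15,}\b")

def extract_sequences(text: str) -> list[str]:
    """Return candidate nucleotide sequences found in a passage of text."""
    return SEQUENCE.findall(text.upper())

# Hypothetical sentence of the kind found in a methods section.
paper_text = "The forward primer 5'-GGTGAAGGTCGGAGTCAACG-3' was used."
print(extract_sequences(paper_text))  # ['GGTGAAGGTCGGAGTCAACG']
```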

Break


Speakers Lauran Qualkenbush & Corinna Raimondo, Northwestern University


Speaker Michèle Nuijten, Tilburg University

statcheck: a spellchecker for statistics
Half of all psychology papers contain inconsistent statistical results, in which the reported p-value does not match the reported test statistic and degrees of freedom. Most of these inconsistencies seem to be small and inconsequential, but over 12.5% of papers contain an inconsistency that might change the statistical conclusion. Such statistical reporting inconsistencies affect the reproducibility and quality of scientific findings.

We developed the R package “statcheck” and the accompanying web app http://statcheck.io to automatically find these inconsistencies. In my talk, I will briefly explain how statcheck works, discuss its potential for preventing statistical errors through self-checks and peer review, and show how statcheck can be used for meta-research. I will also give a brief overview of current developments, including a statcheck Word add-in and the use of Natural Language Processing techniques to expand statcheck’s search algorithm.
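
statcheck itself is an R package; as a minimal Python sketch of the core consistency check it automates (the regex, tolerance, and function below are illustrative assumptions, not statcheck's actual implementation, which also accounts for rounding and one-tailed tests), one can recompute the p-value implied by a reported test statistic and degrees of freedom:

```python
import re
from scipy import stats

# Parse APA-style t-test reports such as "t(28) = 2.20, p = .04".
T_TEST = re.compile(r"t\((\d+)\)\s*=\s*(-?\d+\.?\d*),\s*p\s*=\s*(\.\d+)")

def check_t_reports(text: str, tolerance: float = 0.005):
    """Recompute two-sided p-values from reported t statistics and flag mismatches.

    A toy version of statcheck's check: the real package also handles F, r,
    chi-square, and z tests, rounding of reported values, and one-tailed tests.
    """
    results = []
    for df, t, p_reported in T_TEST.findall(text):
        p_computed = 2 * stats.t.sf(abs(float(t)), int(df))  # two-sided p
        consistent = abs(p_computed - float(p_reported)) < tolerance
        results.append((f"t({df}) = {t}", float(p_reported),
                        round(p_computed, 4), consistent))
    return results

print(check_t_reports("The effect was significant, t(28) = 2.20, p = .04."))
# [('t(28) = 2.20', 0.04, 0.0363, True)]
```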

Contributed presentations


Speaker Edward J. Delp, Purdue University

A System for Forensic Analysis of Scientific Images
In this talk I will describe a system that we are developing for the forensic analysis of images and other media extracted from a scientific publication. This system uses many modern media forensic methods to examine images and determine whether an image has likely been altered or modified. Available tools include duplication detection, copy/move detection, provenance analysis, and other media forensics methods. The current system has methods for extracting images, figures, and captions and maintaining the relative relationships of the figures in a paper. The system has a simple and intuitive web-based user interface and a sophisticated database, and is easily extensible using Docker containers.
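
The abstract does not specify which algorithms the Purdue system uses; as a minimal sketch of one common building block for duplication detection, a perceptual hash can flag near-duplicate images (the `imagehash` library, the distance threshold, and the file names below are illustrative assumptions, not details of the system described in the talk):

```python
from itertools import combinations

from PIL import Image
import imagehash  # pip install imagehash

def near_duplicates(paths, max_distance=5):
    """Flag image pairs whose perceptual hashes are within max_distance bits.

    Small Hamming distances between perceptual hashes survive resizing and
    recompression, though a full forensic system must also handle the
    cropping, rotation, and splicing this simple check misses.
    """
    hashes = {p: imagehash.phash(Image.open(p)) for p in paths}
    return [
        (a, b, hashes[a] - hashes[b])  # "-" yields the Hamming distance
        for a, b in combinations(paths, 2)
        if hashes[a] - hashes[b] <= max_distance
    ]

# Hypothetical figure panels extracted from a paper.
print(near_duplicates(["fig1a.png", "fig2c.png", "fig3b.png"]))
```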


Panel: Institutional investigators

  • Wanda Jones, ORI (panel chair)
  • William C. Trenkle, USDA
  • Wouter Vandevelde, KU Leuven
  • Mary Walsh, Harvard University

Break

Panel: Publishers

  • Bernd Pulverer, EMBO (panel chair)
  • Renee Hoch, PLOS ONE
  • IJsbrand Jan Aalbersberg, Scopus
  • Maria Kowalczuk, Springer Nature

Break

Closing remarks



Speaker Elisabeth Bik, Harbers-Bik LLC

Image duplication detection tools — insights from a human spotter
Despite peer review and editorial screening, science papers can still contain images or other data of concern. A visual scan of 20,000 papers published in 40 biomedical journals showed that 4% contained inappropriately duplicated images. Papers containing incorrect or even falsified data can lead to wasted time and money as other researchers try to reproduce those results. Thorough image screening before publication would benefit editors, publishers, and readers, and act as a deterrent for fraudulent submissions. There is a great need for high-throughput computational tools to find image duplications and manipulations in scientific manuscripts, and to help detect the growing number of fabricated manuscripts produced by paper mills. Elisabeth Bik will present some case examples, insights, and challenges that she has encountered as a human visual duplication detector.

Speaker Debora Weber-Wulff, HTW Berlin - University of Applied Sciences

Responsible Use of Support Tools for Plagiarism Detection
Many academic institutions believe they can solve the problem of plagiarism simply by purchasing so-called plagiarism detection software. But as a recent test of such support tools shows, the systems do not find all plagiarism, and they report text overlap that is not plagiarism as if it were. Institutions that rely solely on a similarity score to determine sanctions need to be aware of how little the numbers these systems report actually mean.
In this talk, the results of the recent test of support tools for detecting plagiarism conducted by the European Network for Academic Integrity will be presented, followed by a discussion of what constitutes responsible use of such tools.
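
To see why a bare similarity number is a weak basis for sanctions, consider a toy word-trigram overlap score of the general kind such tools report (a deliberately simplified illustration, not any vendor's actual algorithm): a properly quoted and cited passage still scores as heavily "similar".

```python
import re

def trigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over word trigrams: a toy stand-in for the
    overlap percentages that text-matching tools report."""
    def trigrams(text: str) -> set:
        words = re.findall(r"[a-z]+", text.lower())
        return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta or tb else 0.0

source = "the experiment was repeated three times under identical conditions"
quoted = ('As Smith (2019) notes, "the experiment was repeated three times '
          'under identical conditions"')
print(round(trigram_jaccard(source, quoted), 2))  # 0.7 despite correct citation
```

The score alone cannot distinguish legitimate quotation from plagiarism; that judgment still requires a human reader.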

Break


Speaker Michael Lauer, NIH


Speaker Matt Turek, DARPA

Contributed presentations

Panel: Funders

  • Benyamin Margolis, ORI (panel chair)
  • Michael Lauer, NIH
  • Matt Turek, DARPA

Competition launch

Break

Panel: Tool developers

  • Daniel Acuna, Syracuse University
  • Jennifer Byrne, The University of Sydney
  • Amit K. Roy-Chowdhury, UC Riverside

Break

Closing remarks



Speaker Boris Barbour, The PubPeer Foundation


Speaker Walter Scheirer, University of Notre Dame

Understanding the Provenance of Visual Disinformation Targeting Science
The COVID-19 pandemic has attracted significant attention to scientific matters related to the cause, treatment, and prevention of the disease that has upended our lives. Alarmingly, not all of the information available on the Internet is what it appears to be. Deceptive memes, bogus ads, and fabricated infographics are proliferating, all of which threaten to undermine the public's trust in science. Given the vast scale of the problem, an automated capability is needed that can identify new instances of visual disinformation, trace their origin, and ultimately flag them as problematic. But compared to text, visual content presents unique challenges for media forensics. This talk presents an end-to-end processing pipeline for image provenance analysis that works at real-world scale. It employs a cutting-edge image filtering solution to find related images, as well as novel techniques for obtaining a provenance graph that expresses how the images, as nodes, are ancestrally connected. Building on provenance analysis, the talk goes on to introduce a scalable automated visual recognition pipeline for discovering meme genres of diverse appearance. This pipeline can ingest meme images from a social network, apply computer vision techniques to extract features and index new images into a database, and then organize the memes into related genres. Recent examples of visual disinformation targeting science will be highlighted, including repurposed imagery, parasitic advertising, and pandemic-related memes. Finally, the talk will conclude with thoughts on continued research in this direction.
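
The abstract describes the provenance pipeline only at a high level; as a minimal sketch of the graph-building step (using networkx, with precomputed pairwise scores standing in for the pipeline's image filtering and feature matching, all illustrative assumptions rather than details of the talk), related images become nodes and the strongest similarity links form a spanning structure:

```python
from itertools import combinations

import networkx as nx  # pip install networkx

# Hypothetical pairwise similarity scores among four images; a real pipeline
# would compute these from local feature matches between image pairs.
SCORES = {
    frozenset(("original.jpg", "cropped.jpg")): 0.92,
    frozenset(("original.jpg", "captioned.jpg")): 0.88,
    frozenset(("cropped.jpg", "captioned.jpg")): 0.81,
    frozenset(("original.jpg", "unrelated.jpg")): 0.10,
    frozenset(("cropped.jpg", "unrelated.jpg")): 0.07,
    frozenset(("captioned.jpg", "unrelated.jpg")): 0.09,
}

def provenance_graph(images, threshold=0.5):
    """Connect images whose similarity clears the threshold, then keep a
    maximum spanning forest as a crude guess at ancestral structure."""
    g = nx.Graph()
    g.add_nodes_from(images)
    for a, b in combinations(images, 2):
        score = SCORES[frozenset((a, b))]
        if score >= threshold:
            g.add_edge(a, b, weight=score)
    return nx.maximum_spanning_tree(g)  # a spanning forest if disconnected

tree = provenance_graph(["original.jpg", "cropped.jpg",
                         "captioned.jpg", "unrelated.jpg"])
print(sorted(tree.edges(data="weight")))
# Two edges link the related images; "unrelated.jpg" stays isolated.
```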


Break


Speaker Mario Biagioli, UCLA


Speaker Thorsten Beck, HEADT Centre - Humboldt University of Berlin

Contributed presentations


Speaker Ivan Oransky, Retraction Watch

From Cancer to COVID-19, Does Science Self-Correct?
Rapid publication of results — particularly on preprint servers — has grown dramatically during the COVID-19 pandemic, and has forced researchers, health care professionals, journalists, and others to grapple with the concept of reliable and actionable information. The pandemic has given rise to more than 80 retractions at the time of this writing. Is that cause for concern? My lens for this talk will be ten years of experience reporting on retractions for Retraction Watch, including creating the world’s most comprehensive database of retractions, with close to 24,000 and counting.


Panel: Journalists

  • Stephanie Lee, BuzzFeed
  • Ivan Oransky, Retraction Watch
  • Richard Van Noorden, Nature
  • Daniel Acuna, Syracuse University (moderator only)

Break

Panel: Investigators/whistleblowers

  • Paul Brookes, University of Rochester (panel chair)
  • Boris Barbour, The PubPeer Foundation
  • Elisabeth Bik, Harbers-Bik LLC
  • Erica Boxheimer, EMBO
  • Jana Christopher, Image-Integrity

Break

Open discussion and next steps

Closing remarks

Sponsors

Main sponsor

This conference is funded by the Office of Research Integrity, Department of Health and Human Services, under grant ORIIR190047.

Participate

Registration

The registration fee is $40 and includes access to all sessions.

Eligibility: Registration is open to researchers, funders, research integrity investigators, and senior leadership with broad interests in research integrity.

The organizers reserve the right to review registrations for eligibility and decline and refund the registration of those who do not meet the eligibility criteria.

Financial support and registration waivers: Registration waivers are available and will be awarded based on need. Please email computationalresearchintegrity@gmail.com.

Submit extended abstract

We invite you to submit research on a broad set of topics on or related to research integrity. Submissions traditionally directed to conferences such as the World Conference on Research Integrity are also welcome.

We especially encourage submissions on the following topics, among others:

  • Computational or best-practices methods for detecting fabrication, falsification, or plagiarism of text, images, statistics, or other research outcomes
  • The role of research integrity offices at the institutional (e.g., ARIO member) and funders' levels (e.g., ORI)
  • The role of publishers and whistleblowers' websites (e.g., PubPeer)
  • Ethical dimensions of automating research integrity
  • Case studies
  • Any other research broadly related to non-computational or computational research integrity

Important dates:

  • Submission deadline: March 1, 2021
  • Acceptance notification: March 8, 2021
  • Registration deadline for accepted work: March 15, 2021

Media Partners

Retraction Watch
Center for Open Science
Journal of Empirical Research on Human Research Ethics
Science and Engineering Ethics Journal