DL Seminar | Private Companies and Scholarly Infrastructure

By Xiran Sun | MS Student, Cornell Tech

At the Digital Life Seminar on September 5, 2019, DLI Postdoctoral Fellow Jake Goldenfein presented his work on Google Scholar's influence on academic infrastructure. The research focuses on the political and ethical implications of scholarly infrastructure shifting to a platform that was founded by a corporation and operates on commercial logic, with limited transparency or accountability.

Google Scholar – a central academic infrastructure

As students, when we want to write a paper on a certain topic, Google Scholar is often the first resource we turn to for relevant information. For researchers, Google Scholar surfaces more recent research and publications than other platforms, making it a strong tool for their work. Google Scholar has become a central academic infrastructure.

However, with this power, Google Scholar might be able to amplify or diminish particular voices. Goldenfein asks what it means for Google to have become such a central actor in the scholarly ecosystem, and how that might not only actively change or disrupt processes of search and evaluation, but also generate broader consequences for the academic field. He focuses on three main functions that have become somewhat infrastructural in the scholarly ecosystem: search, bibliometrics, and scholar profiles.

Google Scholar Bibliometrics

We’ve had commercially provided bibliometrics for a long time. However, the provision of bibliometrics by Google has a different character, and it differs in the following ways.

  • Index - Google Scholar's index is vastly bigger than others. It indexes not just authoritative academic publishers, but everything it deems ‘scholarly’, from journals to free repositories to blogs. Some have described this as ‘Switching from a controlled environment where the output, dissemination, and evaluation of scientific knowledge is monitored to an environment that lacks any kind of control other than a researcher's conscience.’

  • Relevance - While older approaches allow the user to define what is relevant, Google Scholar's relevance ranking system is opaque. The consensus seems to be that the primary ranking is by citation count, using automated citation indexing to determine ‘relevance’.

  • Citation - Google Scholar's citation data relies on unstructured web retrieval techniques: automated software ‘parsers’ identify and extract bibliometric material from scholarly content, paying no attention to structured data.

  • Profiles - Google Scholar is also able to link bibliometrics to researchers and produce researcher profiles. A search for a researcher's profile on Google Scholar returns rich, automatically generated information about the researcher, including publications, citation counts, and H-index.
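
For readers unfamiliar with the H-index shown on these profiles, the following is a minimal sketch of the standard definition: the largest number h such that h of a researcher's papers have at least h citations each. The function name `h_index` and the sample citation counts are illustrative assumptions, and this is not a description of how Google Scholar itself computes or stores the metric.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one researcher's papers.
sample_counts = [25, 8, 5, 3, 3, 1]
print(h_index(sample_counts))
```

Since the result depends entirely on the citation counts fed in, the accuracy of automated citation extraction directly affects the number displayed on a profile.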

The present work asks the following questions:

  1. What is the relationship of Google Scholar to new trends in scholarly communication and evaluation?

  2. What does it mean for the academic field that Google Scholar is not transparent?

  3. Does the automation of indexing undermine the values associated with scholarly evaluation or tertiary education?

  4. Are we OK with academic services being a two-sided market – i.e. the ‘platformization’ of the university?

  5. Is Google Scholar sufficiently accountable considering its importance to the academic community?

Contextual Integrity + Handoff

To examine these questions in depth, Goldenfein uses a framework of ethical and political analysis that combines Contextual Integrity and the theory of value-function handoffs.

Historically, ideas about the value of a scholarly document have been based on citations from other documents. Bibliometrics is thus the central tool that facilitates a competitive market among journals, researchers and universities, even though it was not designed for the evaluation of journals. This is how the system worked before digital platforms entered academia.

However, new digital platforms have entered the publishing ecosystem, along with organizational platforms, and they have redistributed the roles of actors in this ecosystem. This raises new political and ethical considerations:

  1. Because of the new interface and its usability, people are freer to use and publish on the Google Scholar platform, which could erode disciplinary boundaries and legitimize "grey literature" or self-publication.

  2. Infrastructural dominance or centrality needs consideration. The redistribution of roles would make academics more reliant on a private platform, redefining the value of a document.

  3. Does automated generation of indexes and research profiles problematically automate dimensions of education and thus undermine values associated with knowledge transmission?

  4. Does the lack of transparency of Google Scholar problematically undermine the autonomy of the academic field, and by consequence undermine claims to objectivity?

One solution could be to ask Google Scholar for some accountability for the infrastructure it provides. That accountability could take the form of added customer service, proper document sourcing, error correction, or library liaisons. However, Google argues that this is not a product but a service, and that it is therefore not responsible for its quality or for how it is used.

On the other hand, the usability of Google Scholar should not prevent scholars from holding themselves to account. For an autonomous scholarly community, the key is to remain responsible for what we provide and publish.

Perhaps the erosion of scholarly norms, driven by the convenient and usable alternative that Google Scholar provides, is a problem that we will have to develop new systems to deal with.


