Research

Working Papers

Open Sourcing GPTs (draft available upon request)

Abstract: This paper explores the economic underpinnings of open-source (OS) contributions in artificial intelligence (AI) by major technology companies, focusing on large language models (LLMs) as a relevant example. An empirical analysis of patent data indicates that LLMs are compatible with the R&D portfolios of a large subset of firms with differentiated technologies, implying a wide range of industrial applications for LLMs. Additionally, this study documents an increase in LLM-related research activities following Meta’s open-source release of Llama, suggesting open-sourcing advanced LLMs can stimulate LLM-related R&D. Motivated by these findings, a theoretical framework is proposed to examine the factors influencing a profit-maximizing firm’s open-sourcing decision. By framing this decision as a trade-off between accelerating the growth of the LLM and securing direct financial returns, the analysis indicates an increased tendency toward open sourcing when the firm’s LLM only moderately surpasses existing OS counterparts. Additionally, the model predicts an inverted-U-shaped relationship between the firm’s propensity to open-source and the share of LLM-compatible applications it develops.

Distinguishing Biases from Personal Preferences: An ‘Honest’ Machine Learning Approach

(With Zahra Khanalizadeh and Negar Ziyaeian) Abstract: This study proposes a new method for estimating biases at the micro-level in scenarios with multiple bilateral interactions, where the presence of individual preferences and correlated characteristics complicates the analysis. The proposed method comprises two stages. In the first stage, the method introduces a novel approach to extract preferences and characteristics, employing Collaborative Filtering with an ‘honest’ design. This technique is designed to separate preferences and self-induced outcomes from the constructed embeddings of interacting units. In the second stage, the method utilizes a Double Machine Learning estimator to identify biases at the unit level, based on the embeddings generated in the first stage. The methodology is applied to a dataset of nearly 150,000 film ratings by professional critics to uncover personal biases among critics towards films directed by women. The results indicate that approximately 5% of critics show a significant bias in favor of films directed by women, once personal preferences and film characteristics are accounted for. However, a ‘naive’ approach that ignores these elements suggests a much higher prevalence of bias among critics. [Code]
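The second stage can be illustrated with a minimal sketch of a cross-fitted, partialling-out Double Machine Learning estimator. This is not the paper's implementation: the function names, the linear nuisance models, the continuous treatment, and the synthetic data are all assumptions made for illustration, and the random `emb` matrix merely stands in for the first-stage embeddings.

```python
import numpy as np

def cross_fit_residuals(target, controls, folds):
    """Residualize `target` on `controls`, predicting each fold out of sample."""
    resid = np.empty_like(target, dtype=float)
    for k in np.unique(folds):
        train, test = folds != k, folds == k
        # Nuisance model: ordinary least squares (an assumption; any learner could be used).
        coef, *_ = np.linalg.lstsq(controls[train], target[train], rcond=None)
        resid[test] = target[test] - controls[test] @ coef
    return resid

def dml_partial_out(y, d, controls, n_folds=5, seed=0):
    """Estimate the effect of d on y after partialling out the controls."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds  # random, roughly equal-sized folds
    y_res = cross_fit_residuals(y, controls, folds)
    d_res = cross_fit_residuals(d, controls, folds)
    # Regress outcome residuals on treatment residuals.
    return float(d_res @ y_res / (d_res @ d_res))

# Synthetic example (hypothetical data): the true effect of d on y is 0.5.
rng = np.random.default_rng(42)
n = 2000
emb = rng.normal(size=(n, 4))  # stand-in for first-stage embeddings
d = emb @ np.array([0.5, -0.3, 0.2, 0.1]) + rng.normal(size=n)
y = 0.5 * d + emb @ np.array([1.0, 0.5, -0.5, 0.2]) + rng.normal(size=n)
theta_hat = dml_partial_out(y, d, emb)
print(f"estimated effect: {theta_hat:.3f}")
```

Cross-fitting keeps each observation out of the sample used to fit its own nuisance predictions, which limits overfitting bias in the final estimate.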

Closing the Gap: Consequences of Almost Free Health Insurance on Healthcare Utilization and Infant Mortality in Iran

(With Mohammad Vesal and Farshad Fatemi)
Abstract: In 2014, the Iranian government introduced the Salamat Universal Insurance Program (SUIP), which offered almost free primary health insurance to every uninsured individual. As a result, more than 10 million individuals received primary health insurance through SUIP, and primary health insurance coverage rose by nearly 15 percentage points among the urban population. We assess the impact of SUIP on healthcare utilization and mortality across age groups in the urban population. The results indicate that the introduction of SUIP improved the utilization of outpatient healthcare and reduced infant mortality. However, we find no discernible impact on the utilization of inpatient services or on the mortality of other age groups. We attribute these findings to the insufficient coverage of primary insurance in the inpatient sector and to mortality being an extreme health outcome for these age groups.

Works in Progress

Measuring the Plurality of Online Discourse

(With Dirk Hovy and Carlo Schwarz) Summary: We propose a new methodology to measure the plurality of online discourse. We formalize the concept of plurality as the semantic variance of online content. Based on this idea, we develop a family of measures that use text embeddings from computational linguistics to analyze content plurality.
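As a rough illustration of treating plurality as semantic variance, the sketch below computes two simple dispersion measures over a set of text embeddings. Both functions and their names are assumptions made for this example, not the measures defined in the project; the embeddings are taken as given and could come from any sentence-embedding model.

```python
import numpy as np

def semantic_variance(embeddings):
    """Mean squared Euclidean distance of the embeddings from their centroid."""
    x = np.asarray(embeddings, dtype=float)
    centroid = x.mean(axis=0)
    return float(((x - centroid) ** 2).sum(axis=1).mean())

def mean_pairwise_cosine_distance(embeddings):
    """Average cosine distance over all distinct pairs of embeddings."""
    x = np.asarray(embeddings, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)  # assumes no zero vectors
    sims = unit @ unit.T
    n = len(x)
    off_diag = sims.sum() - np.trace(sims)  # drop self-similarities
    return float(1.0 - off_diag / (n * (n - 1)))

# Toy two-dimensional "embeddings" (hypothetical values).
diverse = [[1.0, 0.0], [0.0, 1.0]]
uniform = [[1.0, 0.0], [1.0, 0.0]]
print(semantic_variance(diverse), semantic_variance(uniform))
print(mean_pairwise_cosine_distance(diverse), mean_pairwise_cosine_distance(uniform))
```

Under either measure, a collection of near-identical posts scores close to zero, while semantically spread-out content scores higher.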

Gender Gaps: A Hollywood Story

(With Zahra Khanalizadeh) Summary: Using rich data on mainstream movies released between 1990 and 2021, we study gender gaps in the film industry, focusing on the outcomes of female-directed movies and the representation of female characters in the films using natural language processing techniques.

Mahyar Habibi